I need to make the html_content of a custom email operator dynamic, because the HTML content differs between jobs.
I also need values such as the row count and the filename to be dynamic.
Below is an example of one email body:
The `filename` has been delivered. `0 rows` for contact from 2020-06-14. If you have any questions or concerns regarding this feed please reply to this email
NOTE: The information contained in this email message is considered confidential and proprietary to the sender and is intended solely for review and use by the named recipient. Any unauthorized review, use, or distribution is strictly prohibited. If you have received this message in error, please advise the sender by reply email and delete the message.
Code:
def execute(self, context):
    if self.source_task_ids:
        ti = context['task_instance']
        self.s3_key = ti.xcom_pull(task_ids=self.source_task_ids, key='s3_key')[0]
    self.s3_key = self.get_s3_key(self.s3_key)
    s3_hook = S3Hook(self.s3_conn_id)
    try:
        if not s3_hook.check_for_key(self.s3_key, bucket_name=self.s3_bucket):
            logger.info(f'The source key {self.s3_key} does not exist in the {self.s3_bucket}')
            rowcount = 0
            self.subject = self.subject
            self.html_content = self.html_content
        else:
            filedata = s3_hook.read_key(self.s3_key, bucket_name=self.s3_bucket)
            rowcount = filedata.count('\n') - 1
            logger.info(f'rowcount: {rowcount}')
            self.subject = self.subject
            self.html_content = self.html_content
        self.snd_mail(self.send_from, self.send_to, self.subject, self.html_content, self.eml_server, files=self.files)
    except Exception as e:
        raise AirflowException(f'Error in sending the Email - {e}')
Airflow supports Jinja templating in operators. It is built into BaseOperator and controlled by the operator's template_fields and template_ext attributes, e.g.:
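from airflow.models import BaseOperator
from airflow.utils.decorators import apply_defaults

class CustomEmailOperator(BaseOperator):
    # Must be a tuple, hence the trailing comma
    template_fields = ("html_content",)
    template_ext = (".html",)

    @apply_defaults
    def __init__(self, html_content, *args, **kwargs):  # other operator arguments omitted
        super().__init__(*args, **kwargs)
        self.html_content = html_content

    def execute(self, context):
        # Rest of operator code, nothing special needs to happen to render the templates
        ...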
Now the html_content field can be either a path to a Jinja-templated file with the .html extension or an HTML string directly. Parameters can be passed to the Jinja template using the params argument of the operator:
task1 = CustomEmailOperator(
    task_id="task1",
    html_content="Hello, {{ params.name }}",
    params={
        "name": "John",
    },
    ...
)
That is how you could pass the filename and number-of-rows parameters. If you do not want to rely on the BaseOperator mechanism to template your email content, e.g. because you need a bit more control, you can also use a helper function available in Airflow:
from airflow.utils.helpers import parse_template_string
html_content = "Hello, {{ params.name }}"
_, template = parse_template_string(html_content)
# The template refers to params.name, so the render context needs a "params" entry
body = template.render(params={"name": "John"})
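One thing to keep in mind is that template_fields are rendered before execute() runs, so a value that is only known inside execute() (such as the row count read from S3) cannot come from the operator's params; it would have to be computed in execute() and handed to a manual render step like the one above. The following is a minimal sketch of computing those values. The helper name build_email_params and the assumption of exactly one header line are mine, not part of the original operator.

```python
import posixpath


def build_email_params(s3_key: str, filedata: str) -> dict:
    """Build the values an email template might need (hypothetical helper).

    Assumes the file has exactly one header line, so the number of data
    rows is the number of lines minus one, never below zero.
    """
    lines = filedata.splitlines()
    rows = max(len(lines) - 1, 0)
    # S3 keys use "/" as a separator, so posixpath gives the last component
    return {"filename": posixpath.basename(s3_key), "rows": rows}


params = build_email_params("feeds/contact_2020-06-14.csv", "id,name\n1,a\n2,b\n")
print(params)
```

The resulting dictionary could then be passed as the params entry of the render context, so that a body such as "The {{ params.filename }} has been delivered. {{ params.rows }} rows ..." picks up both values.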
TLDR
In the Python callable for a SimpleHttpOperator response function, I am trying to push an XCom that combines information from two sources under a specified key (a hash of the filename/path and an object lookup from a DB).
Longer Tale
I have a filesensor written which grabs all new files and passes them to MultiDagRun to parallel process the information (scientific) in the files as xcom. Works great. The simpleHttpOperator POSTs filepath info to a submission api and receives back a task_id which it must then read as a response from another (slow running) api to get the result. This I all have working fine. Files get scanned, it launches multiple dags to process, and returns objects.
But... I cannot puzzle out how to push the result to an xcom inside the python response function for the simpleHttpOperator.
My google- and SO and Reddit-fu has failed me here (and it seems overkill to use the pythonOperator tho that's my next stop.). I notice a lot of people asking similar questions though.
How do you use context or ti or task_instance or context['task_instance'] with the response function? (I cannot use "Returned Value" xcom as I need to distinguish the xcom keys as parallel processing afaik). As the default I have context set to true in the default_args.
Sure I am missing something simple here, but stumped as to what it is (note, I did try the **kwargs and ti = kwargs['ti'] below as well before hitting SO...
def _handler_object_result(response, file):
    # Note: related to api I am calling not Airflow internal task ids
    header_result = response.json()
    task_id = header_result["task"]["id"]
    api = "https://redacted.com/api/task/result/{task_id}".format(task_id=task_id)
    resp = requests.get(api, verify=False).json()
    data = json.loads(resp["data"])
    file_object = json.dumps(data["OBJECT"])
    file_hash = hash(file)
    # This is the part that is not working as I am unsure how
    # to access the task instance to do the xcom_push
    ti.xcom_push(key=file_hash, value=file_object)
    if ti.xcom_pull(key=file_hash):
        return True
    else:
        return False
and the Operator:
object_result = SimpleHttpOperator(
    task_id="object_result",
    method='POST',
    data=json.dumps({"file": "{{ dag_run.conf['file'] }}", "keyword": "object"}),
    http_conn_id="coma_api",
    endpoint="/api/v1/file/describe",
    headers={"Content-Type": "application/json"},
    extra_options={"verify": False},
    response_check=lambda response: _handler_object_result(response, "{{ dag_run.conf['file'] }}"),
    do_xcom_push=False,
    dag=dag,
)
I was really expecting the task_instance object to be available in some fashion, either by default or through configuration, but each variation that has worked elsewhere (filesensor, pythonOperator, etc.) hasn't worked here, and I have been unable to google the magic words to make it accessible.
You can try using the get_current_context() function in your response_check function:
from airflow.operators.python import get_current_context

def _handler_object_result(response, file):
    # Note: related to api I am calling not Airflow internal task ids
    header_result = response.json()
    task_id = header_result["task"]["id"]
    api = "https://redacted.com/api/task/result/{task_id}".format(task_id=task_id)
    resp = requests.get(api, verify=False).json()
    data = json.loads(resp["data"])
    file_object = json.dumps(data["OBJECT"])
    file_hash = hash(file)
    ti = get_current_context()["ti"]  # <- Try this
    ti.xcom_push(key=file_hash, value=file_object)
    if ti.xcom_pull(key=file_hash):
        return True
    else:
        return False
That function is a nice way of still accessing the task's execution context when context isn't explicitly handy or you don't want to pass context attrs around to access it deep in your logic stack.
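A separate point about the key itself: Python's built-in hash() on a string is randomized per interpreter process unless PYTHONHASHSEED is fixed, and it returns an int, so a key built with hash(file) in one task process may not match the key computed later in another. Note also that the "{{ dag_run.conf['file'] }}" string inside the lambda is passed as a literal, since only an operator's template fields are rendered; the real path could instead be read from the context's dag_run.conf. A minimal sketch of a deterministic string key follows; the helper name stable_xcom_key is hypothetical and the choice of SHA-256 is only one option.

```python
import hashlib


def stable_xcom_key(file_path: str) -> str:
    """Return a deterministic hex digest of the path (hypothetical helper).

    Unlike hash(), the result is the same in every process and on every
    worker, and it is already a string.
    """
    return hashlib.sha256(file_path.encode("utf-8")).hexdigest()


key = stable_xcom_key("/data/incoming/sample_001.dat")
print(len(key))
```

A downstream task that knows the file path can recompute the same key and pull the value with it.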
I'm using the MySqlHook to show results of my SQL query in Airflow logs. All is working except that I want to limit the results to a particular schema. Even though I'm passing in that schema, MySqlHook doesn't seem to be recognizing it.
Here's my function that uses the hook:
def func(mysql_conn_id, sql, schema):
    """Print results of sql query """
    print("schema:", schema)
    hook = MySqlHook(mysql_conn_id=mysql_conn_id, schema=schema)
    df = hook.get_pandas_df(sql=sql)
    print("\n" + df.to_string())
The schema that I pass in does show up in the logs from my print statement.
But when I see the results of the second print statement here (the df.to_string()), the connection that shows up in Airflow is without the schema.
INFO - Using connection to: id: dbjobs_mysql. Host: kb-qa.local, Port: None, Schema: , Login: my_user, Password: ***, extra: {}
And the query is running for more than just the given schema.
Looking in the source code it seems like what I did is what it's expecting:
class MySqlHook(DbApiHook):
    conn_name_attr = 'mysql_conn_id'
    default_conn_name = 'mysql_default'
    supports_autocommit = True

    def __init__(self, *args, **kwargs):
        super(MySqlHook, self).__init__(*args, **kwargs)
        self.schema = kwargs.pop("schema", None)
I need help understanding how to process a user-supplied token in my FastAPI app.
I have a simple app that takes a user-session key, which may or may not be a JWT. I will then call a separate API to validate this token and decide whether to proceed with the request.
Where should this key go in the request:
1. In the Authorization header as a basic token?
2. In a custom user-session header key/value?
3. In the request body with the rest of the required information?
I've been playing around with option 2 and have found several ways of doing it:
Using APIKey as described here:
async def create(api_key: APIKey = Depends(validate)):
Declaring it in the function as described in the docs here:
async def create(user_session: str = Header(description="The Users session key")):
and having a separate Depends in the router config.
The best approach is to build a custom dependency using any one of the already existing authentication dependencies as a reference.
Example:
class APIKeyHeader(APIKeyBase):
    def __init__(
        self,
        *,
        name: str,
        scheme_name: Optional[str] = None,
        description: Optional[str] = None,
        auto_error: bool = True
    ):
        self.model: APIKey = APIKey(
            **{"in": APIKeyIn.header}, name=name, description=description
        )
        self.scheme_name = scheme_name or self.__class__.__name__
        self.auto_error = auto_error

    async def __call__(self, request: Request) -> Optional[str]:
        api_key: str = request.headers.get(self.model.name)
        # add your logic here, something like the one below
        if not api_key:
            if self.auto_error:
                raise HTTPException(
                    status_code=HTTP_403_FORBIDDEN, detail="Not authenticated"
                )
            else:
                return None
        return api_key
After that, just follow this from documentation to use your dependency.
I want to send data from app.post() to app.get() using RedirectResponse.
@app.get('/', response_class=HTMLResponse, name='homepage')
async def get_main_data(request: Request,
                        msg: Optional[str] = None,
                        result: Optional[str] = None):
    if msg:
        response = templates.TemplateResponse('home.html', {'request': request, 'msg': msg})
    elif result:
        response = templates.TemplateResponse('home.html', {'request': request, 'result': result})
    else:
        response = templates.TemplateResponse('home.html', {'request': request})
    return response

@app.post('/', response_model=FormData, name='homepage_post')
async def post_main_data(request: Request,
                         file: FormData = Depends(FormData.as_form)):
    if condition:
        ......
        ......
        return RedirectResponse(request.url_for('homepage', **{'result': str(trans)}), status_code=status.HTTP_302_FOUND)
    return RedirectResponse(request.url_for('homepage', **{'msg': str(err)}), status_code=status.HTTP_302_FOUND)
How do I send result or msg via RedirectResponse, url_for() to app.get()?
Is there a way to hide the data in the URL either as path parameter or query parameter? How do I achieve this?
I am getting the error starlette.routing.NoMatchFound: No route exists for name "homepage" and params "result". when trying this way.
Update:
I tried the below:
return RedirectResponse(app.url_path_for(name='homepage')
+ '?result=' + str(trans),
status_code=status.HTTP_303_SEE_OTHER)
The above works, but it works by sending the param as query param, i.e., the URL looks like this localhost:8000/?result=hello. Is there any way to do the same thing but without showing it in the URL?
For redirecting from a POST to a GET method, please have a look at this and this answer on how to do that and the reason for using status_code=status.HTTP_303_SEE_OTHER (example is given below).
As for the reason for getting starlette.routing.NoMatchFound error, this is because request.url_for() receives path parameters, not query parameters. Your msg and result parameters are query ones; hence, the error.
A solution would be to use a CustomURLProcessor, as suggested in this and this answer, allowing you to pass both path (if need to) and query parameters to the url_for() function and obtain the URL. As for hiding the path and/or query parameters from the URL, you can use a similar approach to this answer that uses history.pushState() (or history.replaceState()) to replace the URL in the browser's address bar.
A complete working example is given below (you can use your own TemplateResponse in place of HTMLResponse).
from fastapi import FastAPI, Request, status
from fastapi.responses import RedirectResponse, HTMLResponse
from typing import Optional
import urllib.parse

app = FastAPI()


class CustomURLProcessor:
    def __init__(self):
        self.path = ""
        self.request = None

    def url_for(self, request: Request, name: str, **params: str):
        # str() because, depending on the Starlette version, url_for may
        # return a URL object rather than a plain string
        self.path = str(request.url_for(name, **params))
        self.request = request
        return self

    def include_query_params(self, **params: str):
        parsed = list(urllib.parse.urlparse(self.path))
        parsed[4] = urllib.parse.urlencode(params)  # index 4 is the query string
        return urllib.parse.urlunparse(parsed)


@app.get('/', response_class=HTMLResponse)
def event_msg(request: Request, msg: Optional[str] = None):
    if msg:
        html_content = """
        <html>
           <head>
              <script>
                 window.history.pushState('', '', "/");
              </script>
           </head>
           <body>
              <h1>""" + msg + """</h1>
           </body>
        </html>
        """
        return HTMLResponse(content=html_content, status_code=200)
    else:
        html_content = """
        <html>
           <body>
              <h1>Create an event</h1>
              <form method="POST" action="/">
                 <input type="submit" value="Create Event">
              </form>
           </body>
        </html>
        """
        return HTMLResponse(content=html_content, status_code=200)


@app.post('/')
def event_create(request: Request):
    redirect_url = CustomURLProcessor().url_for(request, 'event_msg').include_query_params(msg="Successfully created!")
    return RedirectResponse(redirect_url, status_code=status.HTTP_303_SEE_OTHER)
Update
Regarding adding query params to url_for(), another solution would be to use Starlette's starlette.datastructures.URL, which now provides an include_query_params method. Example:
from starlette.datastructures import URL
# str() keeps this working whether url_for returns a str or a URL object
redirect_url = URL(str(request.url_for('event_msg'))).include_query_params(msg="Successfully created!")
return RedirectResponse(redirect_url, status_code=status.HTTP_303_SEE_OTHER)
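The underlying mechanics can be tried outside FastAPI: the route lookup only produces the path, and the query string is attached to it afterwards. The following is a small standard-library sketch of that second step; the function name with_query_params and the localhost URL are illustrative assumptions.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def with_query_params(url: str, **params: str) -> str:
    """Return url with its query string replaced by the encoded params."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(query=urlencode(params)))


redirect_url = with_query_params("http://localhost:8000/", msg="Successfully created!")
print(redirect_url)
```

Because urlencode percent-encodes the values, messages containing spaces or punctuation remain valid in the redirect URL.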
For an application, I have followed the FastAPI documentation for the authentication process.
By default, OAuth2PasswordBearer raises an HTTPException with status code 401. So, I can't check whether a user is actually connected without returning a 401 error to the client.
An example of what I want to do:
app = FastAPI()
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="api/users/token")

def get_current_user(token: str = Depends(oauth2_scheme)):
    try:
        settings = get_settings()
        payload = jwt.decode(token, settings.secret_key,
                             algorithms=[settings.algorithm_hash])
        email = payload.get("email")
        if email is None:
            raise credentials_exception
        token_data = TokenData(email=email)
    except jwt.JWTError:
        raise credentials_exception
    user = UserNode.get_node_with_email(token_data.email)
    if user is None:
        raise credentials_exception
    return user

@app.get('/')
def is_connected(user = Depends(get_current_user)):
    # here, I can't do anything if the user is not connected,
    # because an exception is raised in the OAuth2PasswordBearer __call__ method ...
    return
I see that the OAuth2PasswordBearer class has an "auto_error" attribute, which controls whether the function returns None or raises an error:
if not authorization or scheme.lower() != "bearer":
    if self.auto_error:
        raise HTTPException(
            status_code=HTTP_401_UNAUTHORIZED,
            detail="Not authenticated",
            headers={"WWW-Authenticate": "Bearer"},
        )
    else:
        return None
So I thought of a workaround:
app = FastAPI()
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="api/users/token", auto_error=False)

def get_current_user(token: str = Depends(oauth2_scheme)):
    if not token:
        return None
    # [ ... same token decoding logic as before ... ]
    return user

@app.get('/')
def is_connected(user = Depends(get_current_user)):
    return user
It works, but I wonder what other ways there are to do this. Is there a more "official" method?
This is a good question and as far as I know, there isn't an "official" answer that is universally agreed upon.
The approach I've seen most often in the FastAPI applications that I've reviewed involves creating a separate dependency for each use case.
While the code works similarly to the example you've provided, the key difference is that it attempts to parse the JWT every time, rather than only checking whether a token is present. Make sure the dependency accounts for malformed JWTs, invalid JWTs, etc.
Here's an example adapted to the general structure you've specified:
# ...other code
oauth2_scheme = OAuth2PasswordBearer(
    tokenUrl="api/users/token",
    auto_error=False
)
auth_service = AuthService()  # service responsible for JWT management

async def get_user_from_token(
    token: str = Depends(oauth2_scheme),
    user_node: UserNode = Depends(get_user_node),
) -> Optional[User]:
    try:
        email = auth_service.get_email_from_token(
            token=token,
            secret_key=config.SECRET_KEY
        )
        user = await user_node.get_node_with_email(email)
        return user
    except Exception:
        # exceptions may include no token, expired JWT, malformed JWT,
        # or database errors - either way we ignore them and return None
        return None

def get_current_user_required(
    user: Optional[User] = Depends(get_user_from_token)
) -> User:  # never returns None: a missing user raises instead
    if not user:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="An authenticated user is required for that action.",
            headers={"WWW-Authenticate": "Bearer"},
        )
    return user

def get_current_user_optional(
    user: Optional[User] = Depends(get_user_from_token)
) -> Optional[User]:
    return user
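The layering can also be seen without FastAPI at all. The following is a minimal, framework-free sketch of the same idea: one function resolves the user and swallows failures, and a strict wrapper turns a missing user into an error. AuthenticationRequired, decode_email and the USERS dictionary are hypothetical stand-ins (for HTTPException with a 401 status, the JWT decoding and the user lookup respectively); they are not part of FastAPI or of the code above.

```python
from typing import Callable, Optional


class AuthenticationRequired(Exception):
    """Hypothetical stand-in for HTTPException(status_code=401)."""


def user_from_token(
    token: Optional[str],
    decode_email: Callable[[str], str],
    find_user: Callable[[str], Optional[dict]],
) -> Optional[dict]:
    """Resolve a user, returning None for a missing or invalid token."""
    if not token:
        return None
    try:
        return find_user(decode_email(token))
    except Exception:
        # Malformed or expired tokens are treated as "not logged in"
        return None


def require_user(user: Optional[dict]) -> dict:
    """Strict variant: a missing user becomes an error."""
    if user is None:
        raise AuthenticationRequired("An authenticated user is required.")
    return user


# Toy collaborators, for illustration only
def decode_email(token: str) -> str:
    if token != "good-token":
        raise ValueError("invalid token")
    return "john@example.com"


USERS = {"john@example.com": {"email": "john@example.com"}}

print(user_from_token("good-token", decode_email, USERS.get))
print(user_from_token("bad-token", decode_email, USERS.get))
```

An endpoint that merely wants to know whether someone is logged in would use the optional form, while an endpoint that must reject anonymous callers would go through the strict wrapper.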