Adding params as a template_fields in BigQueryOperator - airflow

I am trying to template the params field in a BigQuery operator as below.
t3 = MyBigQueryOperator(
    task_id='autoplay_calc',
    bql='autoplay_calc.sql',
    params={
        "env": deployment,
        "region": region,
        "partition_start_date": '{{ macros.ds_add(ds, -1) }}',
    },
    bigquery_conn_id='gcp_conn',
    use_legacy_sql=False,
    write_disposition='WRITE_APPEND',
    allow_large_results=True,
    provide_context=True,
    destination_dataset_table=reporting_project + '.pa_reporting_public_batch.autoplay_calc',
    dag=dag
)
I realise that params is not a templated field, hence I extended BigQueryOperator as below to make it one.
class MyBigQueryOperator(BigQueryOperator):
    template_fields = ('params', 'bql', 'destination_dataset_table')
However, when I run the code it does not seem to render the params field, as I receive the error message below:
Could not cast literal "{{ macros.ds_add(ds, -1) }}

Short answer: params does not support templating as it's a dictionary and it would require applying jinja2 to key-value pairs. You cannot add the support by just extending the template_fields attribute.

The problem is that simply adding 'params' to 'template_fields' is not enough: when Airflow renders the other templated fields, the task_instance doing the rendering uses the 'params' in the 'context' dictionary instead of the one you just rendered.
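To make that concrete, here is a minimal sketch that uses a toy single-pass renderer built on the standard library as a stand-in for Jinja. It is not Airflow's or Jinja's actual code; render, the context contents and the table name are all made up for illustration. It only shows why a macro nested inside a params value survives a single rendering pass, and why rendering params first and putting the result back into the context fixes it.

```python
import re

# Toy stand-in for Jinja: matches {{ dotted.name }} placeholders only
_PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def render(template, context):
    """Replace each {{ dotted.name }} with its value from context, in a single pass."""
    def lookup(match):
        value = context
        for part in match.group(1).split("."):
            value = value[part]
        return str(value)
    # re.sub does not rescan the replaced text, so nested placeholders stay as-is
    return _PLACEHOLDER.sub(lookup, template)


# Hypothetical context: 'params' still holds an unrendered macro
context = {"ds_nodash": "20200101", "params": {"table": "events${{ ds_nodash }}"}}
sql = "SELECT * FROM {{ params.table }}"

# One pass: the macro inside params is copied into the SQL verbatim
single_pass = render(sql, context)

# Render the params values first, then render the SQL against the updated context
rendered_params = {key: render(value, context) for key, value in context["params"].items()}
two_pass = render(sql, {**context, "params": rendered_params})

print(single_pass)
print(two_pass)
```

The second variant corresponds to the idea of replacing context['params'] with a rendered copy before the remaining fields are templated.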
There are multiple ways to work around that.
Overriding context
I think the class speaks for itself; I've kept the original docstring for a better explanation of the problem.
from typing import Dict, Optional
import jinja2
# BigQueryOperator is assumed to be imported already, as in the question
class ExtendedBigQueryOperator(BigQueryOperator):
    """
    For parameters in 'params' containing macros, those macros are not templated because 'params' is not templated.
    Example:
        operator = ExtendedBigQueryOperator(
            task_id='test',
            sql="SELECT {{ params.columns }} FROM {{ params.bq_table_id }}",
            params={
                'columns': "*",
                'bq_table_id': "project_id.dataset.id.table_id${{ds_nodash}}"}
        )
    Here, 'columns' does not contain macros and will be correctly templated, but 'bq_table_id' will be templated as
    'project_id.dataset.id.table_id${{ds_nodash}}' instead of 'project_id.dataset.id.table_id$20200101'
    (if ds was 2020-01-01).
    Just making 'params' a template_fields won't work, because the sql is templated using the params in the 'context'
    dictionary and not the new templated one. We need to render params and add it to the 'context' dictionary in the
    'render_template_fields' method.
    """
    def render_template_fields(self, context: Dict, jinja_env: Optional[jinja2.Environment] = None) -> None:
        """ Add the rendered 'params' to the context dictionary before running the templating """
        # Like the original method, get the env if not provided
        if not jinja_env:
            jinja_env = self.get_template_env()
        # Run the render template on params and add it to the context
        if self.params:
            context['params'] = self.render_template(self.params, context, jinja_env, set())
        # Call the original method
        super().render_template_fields(context=context, jinja_env=jinja_env)
Using .format()
This solution is not recommended; it is kept here because it is what I used originally.
A quick and dirty workaround I use is to create a custom class that will run a .format on the sql text:
class ExtendedBigQueryOperator(BigQueryOperator):
    """
    For parameters in 'params' containing macros, the macros are not templated because 'params' is not templated.
    Just making 'params' a template_fields won't work either, because the sql is templated using the parent params and not the new templated one (or something along these lines).
    So instead of using the templating from Jinja, we use the 'format' string method that will be executed at the pre_execute stage.
    This means that you can use params with macros in your sql, but using the 'format' format, in single brackets without using the 'params.' prefix.
    For example, instead of {{params.process_table}}, you would use {process_table}
    Note: I always use single brackets even for params without macro for consistency, but it is not necessary.
    """
    template_fields = ["params", *BigQueryOperator.template_fields]
    def pre_execute(self, context):
        self.sql = self.sql.format(**self.params)
Your SQL will then be exactly the same, except that every variable from params should be wrapped in single braces instead of double braces (Airflow macros should be passed in as params values), and you need to remove the 'params.' prefix.
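As a small illustration of what the .format() call does to the query text, here is a sketch with a made-up SQL string and made-up params (the table name, column name and date are hypothetical, and the date stands for a value that was already rendered from a macro). It also shows one caveat of this approach that follows from how str.format works: literal braces in the SQL have to be doubled.

```python
# Hypothetical SQL and params, only to illustrate str.format on a query string
sql_template = (
    "SELECT * FROM {process_table} "
    "WHERE event_date = '{partition_start_date}'"
)
params = {
    "process_table": "my_project.my_dataset.events",
    "partition_start_date": "2020-01-01",  # imagine this came from a rendered macro
}

# Same call as in pre_execute above, applied to a plain string
sql = sql_template.format(**params)

# Literal braces must be doubled, otherwise str.format treats them as fields
sql_with_braces = "SELECT '{{literal}}' AS s FROM {process_table}".format(**params)

print(sql)
print(sql_with_braces)
```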

Related

Pass additional parameters to django_table2 TemplateColumn

In my Django project I have a lot of tables which render models. The last column is mostly an action column where users may edit or delete an instance. How do I pass additional arguments to TemplateColumn if in some tables I want an edit and a delete button, and in other tables I only need an edit and an info button? I want to use the same template.html but with conditions in it. Here is what I have in my table:
import django_tables2 as tables
from select_tool.models import DefactoCapability
class DefactoCapabilityTable(tables.Table):
    my_column = tables.TemplateColumn(verbose_name='Actions', template_name='core/actionColumnTable.html')
    class Meta:
        model = DefactoCapability
        template_name = 'django_tables2/bootstrap-responsive.html'
        attrs = {'class': 'table table-xss table-hover'}
        exclude = ('body',)
        orderable = False
And how do I check perms on the actions in order to display the button or not?
Quoting the TemplateColumn docs:
A Template object is created [...] and rendered with a context containing:
record – data record for the current row
value – value from record that corresponds to the current column
default – appropriate default value to use as fallback
row_counter – The number of the row this cell is being rendered in.
any context variables passed using the extra_context argument to TemplateColumn.
So you could do something like this:
my_column = tables.TemplateColumn(
    template_name='core/actionColumnTable.html',
    extra_context={
        'edit_button': True,
    }
)
The context also contains the complete context of the template from where {% render_table %} is called. So if you have 'django.template.context_processors.request' in your context_processors, you can access the current user using {{ request.user }}.

Passing arguments to an embedded controller in Symfony 2.8

I'm using Symfony 2.8.0 (as I find Symfony 3.x not very mature at the moment, but let's not go into that discussion right now).
According to the official documentation
(http://symfony.com/doc/2.8/book/templating.html#embedding-controllers)
it should be possible to pass arguments to an embedded controller invoked from within a view.
However, this doesn't seem to work. I always end up with the following exception:
"Controller "AppBundle\Controller\DefaultController::buildNavigationAction()" requires that you provide a value for the "$argument1" argument (because there is no default value or because there is a non optional argument after this one)."
Within my view I have the following bit of code:
{{ render(controller('AppBundle:Default:buildNavigation'), {
    'argument1': 25,
    'argument2': 50
}) }}
The controller looks like this:
public function buildNavigationAction($argument1, $argument2)
{
    // ... some logic ...
    return $this->render(
        'navigation.html.twig', array(
            'foo' => $argument1,
            'bar' => $argument2
        )
    );
}
What gives? Is this a bug?
The use case described in the documentation (rendering dynamic content from within the base template and therefore on every page) is exactly what I'm using it for. Repeating the same logic in every single controller is an obvious sin against the DRY principle.
Your syntax is incorrect: you are closing the ) of controller() too early, so the values are passed to render() instead of to the controller. It should instead be:
{{ render(controller('AppBundle:Default:buildNavigation', {
    'argument1': 25,
    'argument2': 50
})) }}

How to get the table name in AWS dynamodb trigger function?

I am new to AWS and working on creating a Lambda function in Python. The function will get the DynamoDB table stream and write to a file in S3. Here the name of the file should be the name of the table.
Can someone please tell me how to get the table name of the trigger that is invoking the Lambda function?
Thanks for help.
Since you mentioned you are new to AWS, I am going to answer descriptively.
I am assuming that you have set 'Stream enabled' setting for your DynamoDB table to 'Yes', and have set up this as an event source to your lambda function.
This is how I got the table name from the stream that invoked my lambda function -
import json
def lambda_handler(event, context):
    print(json.dumps(event, indent=2))  # Shows what's in the event object
    for record in event['Records']:
        ddbARN = record['eventSourceARN']
        ddbTable = ddbARN.split(':')[5].split('/')[1]
        print("DynamoDB table name: " + ddbTable)
    return 'Successfully processed records.'
Basically, the event object holds all the information about the DynamoDB stream records that caused that particular Lambda invocation, and each record contains a parameter eventSourceARN. This eventSourceARN is the ARN (Amazon Resource Name) of the stream, and it includes the name of the DynamoDB table from which the event occurred.
This is a sample value for eventSourceARN -
arn:aws:dynamodb:us-east-1:111111111111:table/test/stream/2020-10-10T08:18:22.385
Notice the segment right after table/ in the value above: test. This is the table name you are looking for.
In the line ddbTable = ddbARN.split(':')[5].split('/')[1] above, I have tried to split the entire ARN by ':' first, and then by '/' in order to get the value test. Once you have this value, you can call S3 APIs to write to a file in S3 with the same name.
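If you prefer something a bit more defensive than chained split calls, the same extraction can be sketched with a regular expression. This is only a sketch: the ARN layout in the comment is assumed from the sample value above, and table_name_from_arn is a hypothetical helper name, not part of any AWS library. It returns None when the ARN is missing or does not look like a DynamoDB table or stream ARN.

```python
import re
from typing import Optional

# Assumed ARN layout, based on the sample value above:
# arn:<partition>:dynamodb:<region>:<account>:table/<table>[/stream/<label>]
_DDB_ARN = re.compile(
    r"^arn:aws[\w-]*:dynamodb:[^:]*:[^:]*:table/([^/]+)(?:/stream/.+)?$"
)


def table_name_from_arn(arn: Optional[str]) -> Optional[str]:
    """Return the table name embedded in a DynamoDB (stream) ARN, or None."""
    if not arn:
        return None
    match = _DDB_ARN.match(arn)
    return match.group(1) if match else None


# Hypothetical record, shaped like the entries in event['Records']
sample_record = {
    "eventSourceARN": "arn:aws:dynamodb:us-east-1:111111111111:table/test/stream/2020-10-10T08:18:22.385"
}

# .get() avoids a KeyError if the key is absent from a record
table = table_name_from_arn(sample_record.get("eventSourceARN"))
missing = table_name_from_arn({}.get("eventSourceARN"))

print(table)
print(missing)
```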
Hope this helps.
Please note that eventSourceARN is not always provided. From my testing today, I didn't see eventSourceARN present in the record. You can also refer to these links:
Issue: https://github.com/aws/aws-sdk-js/issues/2226
API: https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_streams_Record.html
One way to do it will be via pattern matching in Scala using regex:
import scala.util.matching.Regex
val ddbArnRegex: Regex = """arn:aws:dynamodb:(.+):(.+):table/(.+)/stream/(.+)""".r
def parseTableName(ddbARN: String): Option[String] = {
  // Option(ddbARN) is None for a null ARN, so the null check is no longer discarded
  Option(ddbARN).flatMap {
    case ddbArnRegex(_, _, table, _) => Some(table)
    case _ => None
  }
}

Include "Change Note" when creating content from InvokeFactory

I am creating a content item from a PloneFormGen Form Custom Script Adapter using invokeFactory. Everything is working fine so far, however we want to start generating a comment to be included in the create action, for the history of the item. The comment itself will be generated using fields from the form and some preset text.
Is this something that would be possible from PFG?
The content type is a custom type, and it is versionable. Using Plone 4.3.2, PFG 1.7.14
EDIT
My current code:
from Products.CMFPlone.utils import normalizeString
portal_root = context.portal_url.getPortalObject()
target = portal_root['first-folder']['my-folder']
form = request.form
title = "My Title: "+form['title-1']
id = normalizeString(title)
id = id+"_"+str(DateTime().millis())
target.invokeFactory(
    "MyCustomType",
    id=id,
    title=title,
    text=form['comments'],
    relatedItems=form['uid']
)
I have tried using keys like comments, comment, message, and even cmfeditions_version_comment within the target.invokeFactory arguments. No luck so far.
I'm not sure if that's possible in a custom script adapter.
The action of your first entry is None. The history automatically shows Create if the action is None. This is implemented here (plone.app.layout.viewlets.content)
# On a default Plone site you got the following
>>> item.workflow_history
{'simple_publication_workflow': ({'action': None, 'review_state': 'private', 'actor': 'admin', 'comments': '', 'time': DateTime('2014/10/02 08:08:53.659345 GMT+2')},)}
The key of the dict is the workflow id and the value is a tuple of all entries.
So you can manipulate the entry like you want. But I don't know if this is possible with restricted python (custom script adapter can only use restricted python).
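To show what manipulating the entry means for this data shape, here is a sketch on plain Python data. The dict below only imitates the structure of item.workflow_history shown above (the timestamp is a plain string instead of a DateTime), and with_creation_comment is a hypothetical helper. On a real Plone object you would still have to assign the result back to the item, which is exactly the step that may not be allowed in restricted Python.

```python
# Plain dict/tuple standing in for item.workflow_history (same shape as above);
# the timestamp is a string here instead of a DateTime object.
workflow_history = {
    "simple_publication_workflow": (
        {"action": None, "review_state": "private", "actor": "admin",
         "comments": "", "time": "2014/10/02 08:08:53.659345 GMT+2"},
    )
}


def with_creation_comment(history, workflow_id, comment):
    """Return a copy of history whose first entry for workflow_id carries comment."""
    entries = history[workflow_id]
    first = dict(entries[0], comments=comment)  # copy the entry, replace 'comments'
    updated = dict(history)
    # The entries are stored in a tuple, which is immutable, so build a new one
    updated[workflow_id] = (first,) + tuple(entries[1:])
    return updated


updated = with_creation_comment(
    workflow_history, "simple_publication_workflow", "Created from the form"
)
print(updated["simple_publication_workflow"][0]["comments"])
```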
But you could also add a new entry by extending your script with:
from Products.CMFCore.utils import getToolByName
...
new_object = target.get(id)
workflow_tool = getToolByName(new_object, 'portal_workflow')
workflows = workflow_tool.getWorkflowsFor(new_object)
if not workflows:
    return
workflow_id = workflows[0].id  # Grab the first workflow; if you have more, take the one you need
review_state = workflow_tool.getInfoFor(new_object, 'review_state', None)
history_entry = {
    'action': action,  # Your action
    'review_state': review_state,
    'comments': comment,  # Your comment
    'actor': actor,  # Probably you could get the logged in user
    'time': time,
}
# Set the status on the newly created object, not on the form (context)
workflow_tool.setStatusOf(workflow_id, new_object, history_entry)

what is @params in Iron:router

With Meteor's IronRouter, I'm trying to use the this.params object elsewhere, but I'm confused as to what it is. It seems to be a zero-length array that is actually an object with methods named after the path components.
# coffee
@route 'magnets',
  path: '/magnets/lesson/:lessonCname'
  data: ->
    if @ready()
      debugger;
      console.log("route.params", @params)
With this code, in the debug console I get:
this.params
[]
this.params.lessonCname
"despite-magnets-01"
typeof(this.params)
"object"
this.params.length
0
this.ready()
But when passing the params object to a server method, the methods (i.e. "lessonCname") disappear.
If my understanding is correct, then the near-term question is what is the best way to retrieve/convert these methods to {property:value} so they can be serialized and passed to server calls?
There are two easy ways of solving your problem: you can either set a global variable from within the data scope (though this is considered bad practice, at least IMO), or you can use the "data" function, which returns the data context for the current template:
data: ->
  window._globalscopedata = @params.whatever  # setting global variable
  return someCollection.findOne  # returns data context
    _id: @params.whatever
When processing this route, I will have the whatever param available in _globalscopedata and my document available in the template context.
Take a look at the source code for retrieving the parameters from a path. params is an array, but it may have named properties. To iterate over everything, you can use a for...in loop:
for (var x in myArray) {
    // Do something.
}
In this way, you can copy over everything to a new object (there may be a simpler way to create a copy).
The params property attached to a RouteController is an object with the following properties:
hash : the value of the URL hash.
query : an object consisting of key/value pairs representing the query string.
a list of URL fragments with their name and actual value.
Let's take an example, for this route definition:
// using iron:router@1.0.0-pre2 new route definition
Router.route("/posts/:slug");
And this URL typed in the browser address bar: /posts/first-post#comments?lang=en
We can use the console to find out precisely what params will actually contain:
> Router.current().params
Which will display this result:
Object {
    hash: "comments",
    slug: "first-post",
    query: {
        lang: "en"
    }
}
Here slug is already a property of the params object whose value is "first-post"; it is not a method.
If you want to extract these URL fragments from params as an object of key/value pairs, you can use underscore's omit:
// getting rid of the hash and the query string
var parameters = _.omit(this.params, ["hash", "query"]);
