how to solve InterfaceError when importing a variable file which has a depth-1 variable being a list? - airflow

I am importing a variable file (eg, variables.json) into airflow, which has one depth-1 variable being a list like this:
{...
"var1": ["value1", "value2"],
...
}
I tried 3 methods:
1). in command line: airflow variables -i variables.json
2). in airflow UI, admin -> Variables -> Choose file -> Import Variables
3). in airflow UI, admin -> Variables -> Create -> input key (ie, Var1) and value (ie, ["value1", "value2"]) respectively.
Method 1 and 2 failed, but 3 succeeded.
method 1 returns info like "15 of 27 variables successfully updated.", which means some variables are not successfully updated
method 2 shows error:
InterfaceError: (sqlite3.InterfaceError) Error binding parameter 1 - probably unsupported type. [SQL: u'INSERT INTO variable ("key", val, is_encrypted) VALUES (?, ?, ?)'] [parameters: (u'var1', [u'value1', u'value2'], 0)] (Background on this error at: http://sqlalche.me/e/rvf5)
I searched and found this thread: InterfaceError:(sqlte3.InterfaceError)Error binding parameter 0.
It seems that sqlite does not support binding a list as a parameter.
I also tested a case where a nested variable (for example, var2_1) is a list, like this:
{...
"var2": {"var2_1": ["A","B"]},
...
}
In this case, all three methods above work.
So my questions are:
(1) Why do methods 1 and 2 fail, while method 3 succeeds, when a depth-1 variable is a list?
(2) Why can a nested (depth-2, 3, ...) variable be a list without any issue?

If you're running Airflow 1.10.3, the import_helper used in the CLI only serializes dict values to JSON.
def import_helper(filepath):
    #...
            for k, v in d.items():
                if isinstance(v, dict):
                    Variable.set(k, v, serialize_json=True)
                else:
                    Variable.set(k, v)
                n += 1
        except Exception:
            pass
        finally:
            print("{} of {} variables successfully updated.".format(n, len(d)))
https://github.com/apache/airflow/blob/1.10.3/airflow/bin/cli.py#L376
The WebUI importer also does the same thing with dict values.
models.Variable.set(k, v, serialize_json=isinstance(v, dict))
https://github.com/apache/airflow/blob/1.10.3/airflow/www/views.py#L2073
However, the current release candidate (1.10.4rc1) shows that future releases will serialize every non-string value to JSON, both in the CLI import_helper
Variable.set(k, v, serialize_json=not isinstance(v, six.string_types))
https://github.com/apache/airflow/blob/1.10.4rc1/airflow/bin/cli.py
...and WebUI importer.
models.Variable.set(k, v, serialize_json=not isinstance(v, six.string_types))
https://github.com/apache/airflow/blob/1.10.4rc1/airflow/www/views.py#L2118
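To see why the depth-1 list fails while the nested one works, it may help to apply the two checks quoted above to the values from the question. The sketch below only mirrors those two isinstance conditions; the function names are hypothetical and are not part of Airflow. It assumes Python 3, where six.string_types is just str.

```python
def serialized_in_1_10_3(value):
    # Mirrors the 1.10.3 condition quoted above: only dict values are JSON-serialized.
    return isinstance(value, dict)


def serialized_in_1_10_4rc1(value):
    # Mirrors the 1.10.4rc1 condition: every non-string value is JSON-serialized.
    return not isinstance(value, str)


variables = {
    "var1": ["value1", "value2"],        # depth-1 list
    "var2": {"var2_1": ["A", "B"]},      # list nested inside a dict
    "var3": "plain string",
}

for key, value in variables.items():
    print(key, serialized_in_1_10_3(value), serialized_in_1_10_4rc1(value))
```

Under the 1.10.3 check, var1 is not a dict, so the raw list is handed to the database layer, which matches the binding error in the question. var2 is a dict, so it is serialized as a whole, nested list included. Method 3 presumably works because the form field already submits the value as a string.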
Until then, serialize non-string values to JSON strings yourself before importing them with the CLI or the WebUI importer...
...and when you retrieve the value of such a variable, pass the option to deserialize it, e.g.
Variable.get('some-key', deserialize_json=True)
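A minimal sketch of that pre-serialization step, using only the standard json module. The helper name preserialize and the sample data are made up for illustration; the idea is simply to turn every non-string top-level value into a JSON string before the file is imported.

```python
import json


def preserialize(variables):
    """Return a copy in which every non-string top-level value is a JSON string."""
    return {
        key: value if isinstance(value, str) else json.dumps(value)
        for key, value in variables.items()
    }


raw = {"var1": ["value1", "value2"], "var3": "plain"}
ready = preserialize(raw)

# The list is now stored as text, so the importer only ever sees strings.
print(ready["var1"])
# Reading it back is the same operation that deserialize_json=True performs.
print(json.loads(ready["var1"]))
```

The resulting dict can be written out with json.dump and imported as usual; string values are left untouched.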

In your variables.json, ["value1", "value2"] is an array, whereas Airflow expects a plain string or a JSON object.
It would work if you stored that array as a string in your JSON.

Related

fastapi + sqlalchemy + pydantic → how to read data return to schema

I'm trying to use FastAPI, SQLAlchemy and pydantic.
My request body schema has an optional list field named files (files: list[schemas.ImageBase]).
I need to read all the entered data one by one, but it doesn't let me loop over the returned list.
This also happens when I get a query result back, for example:
def get_setting(svalue: int, s_name: str):
    db = SessionLocal()
    query = db.query(models.Setting)\
        .filter(
            models.Setting.svalue == svalue,
            models.Setting.appuser == s_name
        ).all()
    return query
which is called from:
async def get_settings(svalue: int, name: str):
    values = crud.get_setting(svalue=svalue, s_name=name)
    return {"settings": values}
But I can't loop over values with a for loop.
Why? Do I have to configure something, or am I using the query or pydantic incorrectly?
I expected to get a list or a dictionary and to be able to read the data.

Union of dict with typed keys not compatible with an empty dict

I'd like to type a dict as either a dictionary where all the keys are integers or a dictionary where all the keys are strings.
However, when I run mypy (v0.991) on the following code:
from typing import Union, Any
special_dict: Union[dict[int, Any], dict[str, Any]]
special_dict = {}
I get an Incompatible types error.
test_dict_nothing.py:6: error: Incompatible types in assignment (expression has type "Dict[<nothing>, <nothing>]", variable has type "Union[Dict[int, Any], Dict[str, Any]]") [assignment]
Found 1 error in 1 file (checked 1 source file)
How do I express my typing intent?
This is a mypy bug, already reported here with priority 1. There is a simple workaround suggested:
from typing import Any
special_dict: dict[str, Any] | dict[int, Any] = dict[str, Any]()
This code typechecks successfully, because you basically give mypy a hint with more specific type of your dictionary. It may be any matching type and won't affect further checking, because you broaden the final type with an explicit type hint.
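As a small runtime check of what that workaround does, the sketch below (assuming Python 3.9+ for subscripted dict; the names SpecialDict, by_call, typed_empty and by_variable are illustrative) shows that calling the parameterized type just builds an ordinary empty dict. It also shows a second spelling that should give the checker the same hint through an intermediate, explicitly typed variable.

```python
from typing import Any, Union

SpecialDict = Union[dict[int, Any], dict[str, Any]]

# Workaround form: the call creates a normal empty dict at runtime, while the
# subscript tells the type checker which member of the union is meant.
by_call: SpecialDict = dict[str, Any]()

# Alternative spelling: give the empty literal a concrete type first,
# then assign it to the broader union-typed name.
typed_empty: dict[int, Any] = {}
by_variable: SpecialDict = typed_empty

print(type(by_call), by_call)
```

Either way, nothing changes at runtime; the extra type information exists only for the checker.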

How to Get the Task Status and Result of a Celery Task Type When Multiple Tasks Are Defined?

Typically I send an asynchronous task with .apply_async method of the Promise defined, and then I use the taskid on the AsyncResult method of the same object to get task status, and eventually, result.
But this requires me to know the exact type of task when more than one task is defined in the same deployment. Is there any way to get the task status and result (if available) without knowing the exact task?
For example, take this example celery master node code.
#!/usr/bin/env python3
# encoding:utf-8
"""Define the tasks in this file."""
from celery import Celery
redis_host: str = 'redis://localhost:6379/0'
celery = Celery(main='test', broker=redis_host,
                backend=redis_host)
celery.conf.CELERY_TASK_SERIALIZER = 'pickle'
celery.conf.CELERY_RESULT_SERIALIZER = 'pickle'
celery.conf.CELERY_ACCEPT_CONTENT = {'json', 'pickle'}
# pylint: disable=unused-argument
@celery.task(bind=True)
def add(self, x: float, y: float) -> float:
    """Add two numbers."""
    return x + y
@celery.task(bind=True)
def multiply(self, x: float, y: float) -> float:
    """Multiply two numbers."""
    return x * y
When I call something like this in a different module
task1=add.apply_async(args=[2, 3]).id
task2=multiply.apply_async(args=[2, 3]).id
I get two uuids for the tasks. But when checking back the task status, I need to know which method (add or multiply) is associated with that task id, since I have to call the method on the corresponding object, like this.
status: str = add.AsyncResult(task_id=task1).state
My question is: how can I fetch the state and result armed only with the task id, without knowing whether the task belongs to add, multiply or any other task type defined?
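id and state are just properties of AsyncResult objects. If you look at the documentation for the AsyncResult class, you will find the name property, which is exactly what you are asking for.
An AsyncResult can also be built from the id alone through the app, e.g. celery.AsyncResult(task1).state, so no specific task object is needed.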

Map function and iterables in python 3.6

I've inherited a piece of code that I need to run somewhere else than the original place with some minor changes. I am trying to map a list of strings to something that applies a function to each element of that list using python 3.6 (a language I am not familiar with).
I would like to use map not list comprehension, but now I doubt this is possible.
In the following example I've tried a combination of for loops, yield (or not), and next(...) or not, but I am not able to make the code working as expected.
I would like to see the print:
AAA! xxx
Found: foo
Found: bar
each time the counter xxx modulo 360 is 0 (zero).
I understand the map function does not execute the code, so then I need to do something to "apply" that function to each element of the input list.
However I am not able to make this thing work. This documentation https://docs.python.org/3.6/library/functions.html#map and https://docs.python.org/3.6/howto/functional.html#iterators do not help that much, I went through it and I think at least one of the commented bits below (# <python code>) should have worked. I am not an experienced python developer and I think I am missing some gotchas about the syntax/conventions of python 3.6 regarding iterators/generators.
issue_counter = 0
def foo_func(serious_stuff):
    # this is actually a call to a module to send an email with the "serious_stuff"
    print("Found: {}".format(serious_stuff))
def report_issue():
    global issue_counter
    # this actually executes once per minute (removed the logic to run this fast)
    while True:
        issue_counter += 1
        # every 6 hours (i.e. 360 minutes) I would like to send emails
        if issue_counter % 360 == 0:
            print("AAA! {}".format(issue_counter))
            # for stuff in map(foo_func, ["foo", "bar"]):
                # yield stuff
                # stuff()
                # print(stuff)
            iterable_stuff = map(foo_func, ["foo", "bar"])
            for stuff in next(iterable_stuff):
                # yield stuff
                print(stuff)
report_issue()
I get lots of different errors/unexpected behaviors of that for loop when running the script:
not printing anything when I call print(...)
TypeError: 'NoneType' object is not callable
AttributeError: 'map' object has no attribute 'next'
TypeError: 'NoneType' object is not iterable
Printing what I am expecting interleaved by None, e.g.:
AAA! 3047040
Found: foo
None
Found: bar
None
I found out that the call to next(iterable_thingy) is what actually invokes the mapped function.
Since we know the length of the input list that was mapped, we know how many times next(iterable_thingy) has to be called, so the function report_issue (from my previous example) runs as expected when defined like this:
def report_issue():
    global issue_counter
    original_data = ["foo", "bar"]
    # this executes once per minute
    while True:
        issue_counter += 1
        # every 6 hours I would like to send emails
        if issue_counter % 360 == 0:
            print("AAA! {}".format(issue_counter))
            iterable_stuff = map(foo_func, original_data)
            for idx in range(len(original_data)):
                next(iterable_stuff)
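Counting the calls to next() by hand is not strictly required: iterating over the map object with a for loop drives it to exhaustion in the same way. The sketch below is a stripped-down illustration rather than the original script; the list sent and the variable calls_before_iteration are additions made here so that the effect can be observed.

```python
sent = []


def foo_func(serious_stuff):
    # Stand-in for the email call; records the item so the effect is visible.
    sent.append(serious_stuff)
    print("Found: {}".format(serious_stuff))


# map() is lazy: creating the map object does not call foo_func yet.
lazy = map(foo_func, ["foo", "bar"])
calls_before_iteration = len(sent)

# Iterating consumes the map object; each step calls foo_func once.
# The yielded values are foo_func's return value (None), so they are ignored.
for _ in lazy:
    pass

# When only the side effect matters, a plain loop says the same thing directly.
for item in ["foo", "bar"]:
    foo_func(item)
```

This also explains the None lines in the output shown earlier: printing what the map object yields prints the return value of foo_func, which is None.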
To troubleshoot this iterable stuff, I found it useful to run ipython (an interactive REPL) and check the type and documentation of the generated iterable, like this:
In [2]: def foo_func(serious_stuff):
   ...:     # this is actually a call to a module to send an email with the "serious_stuff"
   ...:     print("Found: {}".format(serious_stuff))
   ...:
In [3]: iterable_stuff = map(foo_func, ["foo", "bar"])
In [4]: iterable_stuff?
Type: map
String form: <map object at 0x7fcdbe8647b8>
Docstring:
map(func, *iterables) --> map object
Make an iterator that computes the function using arguments from
each of the iterables. Stops when the shortest iterable is exhausted.
In [5]: next(iterable_stuff)
Found: foo
In [6]: bar_item = next(iterable_stuff)
Found: bar
In [7]: bar_item?
Type: NoneType
String form: None
Docstring: <no docstring>
In [8]:

Erlang: How to create a function that returns a string containing the date in YYMMDD format?

I am trying to learn Erlang and I am working on the practice problems Erlang has on the site. One of them is:
Write the function time:swedish_date() which returns a string containing the date in swedish YYMMDD format:
time:swedish_date()
"080901"
My function:
-module(demo).
-export([swedish_date/0]).
swedish_date() ->
    [YYYY,MM,DD] = tuple_to_list(date()),
    string:substr((integer_to_list(YYYY, 3,4)++pad_string(integer_to_list(MM))++pad_string(integer_to_list(DD)).
pad_string(String) ->
    if
        length(String) == 1 -> '0' ++ String;
        true -> String
    end.
I'm getting the following errors when compiled.
demo.erl:6: syntax error before: '.'
demo.erl:2: function swedish_date/0 undefined
demo.erl:9: Warning: function pad_string/1 is unused
error
How do I fix this?
After fixing your compilation errors, you're still facing runtime errors. Since you're trying to learn Erlang, it's instructive to look at your approach and see if it can be improved, and fix those runtime errors along the way.
First let's look at swedish_date/0:
swedish_date() ->
    [YYYY,MM,DD] = tuple_to_list(date()),
Why convert the tuple to a list? Since you use the list elements individually and never use the list as a whole, the conversion serves no purpose. You can instead just pattern-match the returned tuple:
    {YYYY,MM,DD} = date(),
Next, you're calling string:substr/1, which doesn't exist:
string:substr((integer_to_list(YYYY,3,4) ++
pad_string(integer_to_list(MM)) ++
pad_string(integer_to_list(DD))).
The string:substr/2,3 functions both take a starting position, and the 3-arity version also takes a length. You don't need either, and can avoid string:substr entirely and instead just return the assembled string:
integer_to_list(YYYY,3,4) ++
pad_string(integer_to_list(MM)) ++
pad_string(integer_to_list(DD)).
Whoops, this is still not right: there is no such function integer_to_list/3, so just replace that first call with integer_to_list/1:
integer_to_list(YYYY) ++
pad_string(integer_to_list(MM)) ++
pad_string(integer_to_list(DD)).
Next, let's look at pad_string/1:
pad_string(String) ->
    if
        length(String) == 1 -> '0' ++ String;
        true -> String
    end.
There's a runtime error here because '0' is an atom and you're attempting to append String, which is a list, to it. The error looks like this:
** exception error: bad argument
in operator ++/2
called as '0' ++ "8"
Instead of just fixing that directly, let's consider what pad_string/1 does: it adds a leading 0 character if the string is a single digit. Instead of using if to check for this condition — if isn't used that often in Erlang code — use pattern matching:
pad_string([D]) ->
    [$0,D];
pad_string(S) ->
    S.
The first clause matches a single-element list, and returns a new list with the element D preceded with $0, which is the character constant for the character 0. The second clause matches all other arguments and just returns whatever is passed in.
Here's the full version with all changes:
-module(demo).
-export([swedish_date/0]).
swedish_date() ->
    {YYYY,MM,DD} = date(),
    integer_to_list(YYYY) ++
        pad_string(integer_to_list(MM)) ++
        pad_string(integer_to_list(DD)).
pad_string([D]) ->
    [$0,D];
pad_string(S) ->
    S.
But a simpler approach would be to use the io_lib:format/2 function to just format the desired string directly:
swedish_date() ->
    io_lib:format("~w~2..0w~2..0w", tuple_to_list(date())).
First, note that we're back to calling tuple_to_list(date()). This is because the second argument for io_lib:format/2 must be a list. Its first argument is a format string, which in our case says to expect three arguments, formatting each as an Erlang term, and formatting the 2nd and 3rd arguments with a width of 2 and 0-padded.
But there's still one more step to address, because if we run the io_lib:format/2 version we get:
1> demo:swedish_date().
["2015",["0",56],"29"]
Whoa, what's that? It's simply a deep list, where each element of the list is itself a list. To get the format we want, we can flatten that list:
swedish_date() ->
    lists:flatten(io_lib:format("~w~2..0w~2..0w", tuple_to_list(date()))).
Executing this version gives us what we want:
2> demo:swedish_date().
"20150829"
Find the final full version of the code below.
-module(demo).
-export([swedish_date/0]).
swedish_date() ->
    lists:flatten(io_lib:format("~w~2..0w~2..0w", tuple_to_list(date()))).
UPDATE: @Pascal comments that the year should be printed as 2 digits rather than 4. We can achieve this by passing the date list through a list comprehension:
swedish_date() ->
    DateVals = [D rem 100 || D <- tuple_to_list(date())],
    lists:flatten(io_lib:format("~w~2..0w~2..0w", DateVals)).
This applies the rem remainder operator to each of the list elements returned by tuple_to_list(date()). The operation is needless for month and day but I think it's cleaner than extracting the year and processing it individually. The result:
3> demo:swedish_date().
"150829"
There are a few issues here:
You are missing a parenthesis at the end of line 6.
You are trying to call integer_to_list/3 when Erlang only defines integer_to_list/1,2.
This will work (it also drops the string:substr wrapper, since string:substr/1 does not exist, and uses "0" instead of the atom '0'; note that it still returns a four-digit year):
-module(demo).
-export([swedish_date/0]).
swedish_date() ->
    [YYYY,MM,DD] = tuple_to_list(date()),
    integer_to_list(YYYY) ++
        pad_string(integer_to_list(MM)) ++
        pad_string(integer_to_list(DD)).
pad_string(String) ->
    if
        length(String) == 1 -> "0" ++ String;
        true -> String
    end.
In addition to the parenthesis error on line 6, you also have an error on line 10, where you use the form '0' instead of "0", so you define an atom rather than a string.
I understand you are doing this for educational purposes, but I encourage you to dig into the Erlang libraries; it is something you will have to do anyway. For a common problem like this, functions already exist that can help you:
swedish_date() ->
    {YYYY,MM,DD} = date(), % no need to convert to a list
    lists:flatten(io_lib:format("~2.10.0B~2.10.0B~2.10.0B",[YYYY rem 100,MM,DD])).
% ~X.Y.ZB means: format an integer in base Y, print X characters, use Z for padding

Resources