Type annotation for Callables with *args - python-3.6

I know how to type-annotate functions which take arguments.
def function(text: str, *args: int) -> None:
    print(text)
    for arg in args:
        print(arg)
My problem, however, is that I don't know what this looks like when another function takes it as an input:
def introduce(foo: Callable[[str, XXX], None], text: str, *args: int) -> None:
    print("Enter...")
    foo(text, *args)
    print("Exit...")

introduce(function, "Hello, World!", 1, 2, 3)
Naturally, the XXX represents whatever it is that *args is supposed to be. However, I can't figure that out, and I've been unable to intuit what belongs there. I don't think it would be int, as that would suggest exactly one additional value. Experimentation has failed me here.
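For reference, a hedged sketch of the usual workaround: in Python 3.6's typing, Callable cannot express "a str followed by any number of ints", so the common escape hatch is an ellipsis, which accepts any argument list at the cost of precision:

from typing import Callable

# Callable[..., None]: any arguments are accepted; the return type is
# still checked. This trades precision for compatibility with *args.
def introduce(foo: Callable[..., None], text: str, *args: int) -> None:
    print("Enter...")
    foo(text, *args)
    print("Exit...")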

Related

How to negate Airflow sensor task result?

Is there a built-in facility or some operator that will run a sensor and negate its status? I am writing a workflow that needs to detect that an object does not exist in order to proceed to eventual success. I have a sensor, but it detects when the object does exist.
For instance, I would like my workflow to detect that an object does not exist. I need almost exactly S3KeySensor, except that I need to negate its status.
The use case you are describing is checking for a key in S3: if it exists, wait; otherwise, continue the workflow. As you mentioned, this is a Sensor use case. The S3Hook has a check_for_key function that checks whether a key exists, so all that's needed is to wrap it in a Sensor's poke function.
A simple basic implementation would be:
from typing import TYPE_CHECKING, Optional, Sequence, Union

from airflow.providers.amazon.aws.hooks.s3 import S3Hook
from airflow.sensors.base import BaseSensorOperator

if TYPE_CHECKING:
    from airflow.utils.context import Context


class S3KeyNotPresentSensor(BaseSensorOperator):
    """Waits for a key to not be present in S3."""

    template_fields: Sequence[str] = ('bucket_key', 'bucket_name')

    def __init__(
        self,
        *,
        bucket_key: str,
        bucket_name: Optional[str] = None,
        aws_conn_id: str = 'aws_default',
        verify: Optional[Union[str, bool]] = None,
        **kwargs,
    ):
        super().__init__(**kwargs)
        self.bucket_name = bucket_name
        self.bucket_key = bucket_key
        self.aws_conn_id = aws_conn_id
        self.verify = verify
        self.hook: Optional[S3Hook] = None

    def poke(self, context: 'Context') -> bool:
        # Negate the regular S3KeySensor logic: succeed when the key is absent.
        return not self.get_hook().check_for_key(self.bucket_key, self.bucket_name)

    def get_hook(self) -> S3Hook:
        """Create and return an S3Hook."""
        if self.hook:
            return self.hook
        self.hook = S3Hook(aws_conn_id=self.aws_conn_id, verify=self.verify)
        return self.hook
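A hypothetical usage sketch (the DAG and task names are illustrative, not from the original post, and it assumes Airflow 2.4+, where DAG accepts a schedule argument):

import pendulum
from airflow import DAG

with DAG(
    dag_id="wait_for_key_absence",  # hypothetical DAG id
    start_date=pendulum.datetime(2023, 1, 1, tz="UTC"),
    schedule=None,
) as dag:
    key_absent = S3KeyNotPresentSensor(
        task_id="key_absent",
        bucket_key="path/to/object",  # hypothetical key
        bucket_name="my-bucket",      # hypothetical bucket
        mode="reschedule",            # free the worker slot between pokes
        poke_interval=60,
    )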
I ended up going another way. I can use the trigger_rule argument of (any) Task -- by setting it to one_failed or all_failed on the next task I can play around with the desired status.
For example,
from airflow.operators.smooth import SmoothOperator
from airflow.sensors.filesystem import FileSensor

file_exists = FileSensor(
    task_id='exists',
    timeout=3,
    poke_interval=1,
    filepath='/tmp/error',
    mode='reschedule',
)
sing = SmoothOperator(task_id='sing', trigger_rule='all_failed')
file_exists >> sing
It requires no added code or operator, but has the possible disadvantage of being somewhat surprising.
Replying to myself in the hope that this may be useful to someone else. Thanks!

Why should I call a BERT module instance rather than the forward method?

I'm trying to extract vector representations of text using BERT in the transformers library, and have stumbled on the following part of the documentation for the BertModel class:

[quoted passage, an image in the original post: the forward-pass recipe is defined in forward(), but one should call the Module instance instead, since the instance call takes care of running the pre and post processing steps]

Can anybody explain this in more detail? A forward pass makes intuitive sense to me (I am trying to get final hidden states, after all), but I can't find any additional information on what "pre and post processing" means in this context.
Thanks up front!
I think this is just general advice about working with PyTorch Modules. The transformers models are nn.Modules, and they require a forward method. However, one should not call model.forward() manually; instead, call model(). The reason is that PyTorch does some work under the hood when you simply call the Module. You can find this in the source code:
def __call__(self, *input, **kwargs):
    for hook in self._forward_pre_hooks.values():
        result = hook(self, input)
        if result is not None:
            if not isinstance(result, tuple):
                result = (result,)
            input = result
    if torch._C._get_tracing_state():
        result = self._slow_forward(*input, **kwargs)
    else:
        result = self.forward(*input, **kwargs)
    for hook in self._forward_hooks.values():
        hook_result = hook(self, input, result)
        if hook_result is not None:
            result = hook_result
    if len(self._backward_hooks) > 0:
        var = result
        while not isinstance(var, torch.Tensor):
            if isinstance(var, dict):
                var = next((v for v in var.values() if isinstance(v, torch.Tensor)))
            else:
                var = var[0]
        grad_fn = var.grad_fn
        if grad_fn is not None:
            for hook in self._backward_hooks.values():
                wrapper = functools.partial(hook, self)
                functools.update_wrapper(wrapper, hook)
                grad_fn.register_hook(wrapper)
    return result
You'll see that forward is called when necessary.
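As a concrete illustration, here is a sketch of the difference in practice (assuming the bert-base-uncased checkpoint and a recent transformers version, where the model returns an output object with last_hidden_state):

import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

inputs = tokenizer("Hello, world!", return_tensors="pt")
with torch.no_grad():
    # Preferred: calling the instance goes through nn.Module.__call__,
    # which runs the pre/post hooks shown above around forward().
    outputs = model(**inputs)
    # model.forward(**inputs) would typically return the same tensors here,
    # but it silently skips the hook machinery.

last_hidden = outputs.last_hidden_state  # shape: (batch, seq_len, hidden_size)
print(last_hidden.shape)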

How to get the correct signatures order of annotations in methods when performing overriding

I am trying to fix some methods annotations on magic and normal methods. For example, I have some cases like:
```
class Headers(typing.Mapping[str, str]):
    ...

    def __contains__(self, key: str) -> bool:
        ...
        return False

    def keys(self) -> typing.List[str]:
        ...
        return ['a', 'b']
```
and when I run mypy somefile.py --disallow-untyped-defs I have the following errors:
error: Argument 1 of "__contains__" incompatible with supertype "Mapping"
error: Argument 1 of "__contains__" incompatible with supertype "Container"
error: Return type of "keys" incompatible with supertype "Mapping"
What I understand is that I need to override the methods using the @override decorator and respect the order of inheritance. Is that correct?
If my assumption is correct, is there any place where I can find the exact signatures of the parent classes?
After asking the question on mypy, the answer was:
Subclassing typing.Mapping[str, str], I'd assume that the function signature for the argument key in __contains__ ought to match the generic type?

__contains__ isn't a generic method -- it's defined to have the type signature __contains__(self, key: object) -> bool. You can check this on typeshed. The reason why __contains__ is defined this way is because doing things like 1 in {"foo": "bar"} is technically legal.

Subclassing def __contains__(self, key) to def __contains__(self, key: str) is in any case more specific. A more specific subtype doesn't violate Liskov, no?
When you're overriding a function, it's ok to make the argument types more general and the return types more specific. That is, the argument types should be contravariant and the return types covariant.
If we did not follow the rule, we could end up introducing bugs in our code. For example:
class Parent:
    def foo(self, x: object) -> None: ...

class Child(Parent):
    def foo(self, x: str) -> None: ...

def test(x: Parent) -> None:
    x.foo(300)  # Safe if 'x' is actually a Parent, not safe if 'x' is actually a Child.

test(Child())
Because we broke Liskov substitution, passing an instance of Child into test ended up introducing a bug.
Basically, if I use Any for key on the __contains__ method, it is correct and mypy won't complain:

def __contains__(self, key: typing.Any) -> bool:
    ...
    return False
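For completeness, a minimal runnable sketch (my own, not from the thread) whose signatures satisfy mypy's supertype checks: key is typed as object to match typeshed's Container, and keys returns the KeysView that Mapping declares:

import typing

class Headers(typing.Mapping[str, str]):
    """Minimal sketch whose overrides match the typeshed supertypes."""

    def __init__(self, data: typing.Dict[str, str]) -> None:
        self._data = data

    def __getitem__(self, key: str) -> str:
        return self._data[key]

    def __iter__(self) -> typing.Iterator[str]:
        return iter(self._data)

    def __len__(self) -> int:
        return len(self._data)

    def __contains__(self, key: object) -> bool:
        # 'object' matches typeshed's Container.__contains__.
        return key in self._data

    def keys(self) -> typing.KeysView[str]:
        # Mapping.keys is declared to return a KeysView, not a List.
        return self._data.keys()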
You can follow the conversation here

Indentation Error in Python 3.6.1 def

I had been having issues with Python 3.7 for quite some time with seemingly pointless indentation errors, so I decided to go back to 3.6, specifically repl.it's Python 3.6.1. As I mentioned, the errors happen for no good reason whatsoever, as far as I can tell. The code is written below:
from random import randint
import functools

printf = functools.partial(print, end=" ")
defNuc = ['C', 'A', 'T', 'G']

def opNuc():

def create():
    nuc = [0]
    nucop = [0]
    length = randint(11, 16)
    print(length - 1)
    for i in range(1, length):
        part = randint(1, 4)
        for a in range(1, 4):
            if part == a:
                nuc = defNuc[a]
                nucOp = defNuc[-a]
        if i != length - 1:
            printf(nuc[i], i, "-")
        else:
            print(nuc[i], i)
    for i in range(1, length):
        if i != length - 1:
            printf(nucOp[i], "-")
        else:
            print(nucop[i])
The error is at line 9, at
def create():
and as for the reason for the error, it just says
expected an indented block
Edit:
This was completely my stupidity, don't take the post seriously, will be deleted in 10 minutes.
You never finished the definition of opNuc, so the parser is expecting an indented line to continue the body of that function. Either add a pass statement to provide a trivial body:

def opNuc():
    pass

or indent the definition of create if that is supposed to be local to the body of opNuc (unlikely, but possible):

def opNuc():
    def create():
        ...
The problem is that your first function, opNuc, was never finished. I have made this simple mistake many times myself, and it is very easy to miss. It's easy to fix, though: just put pass inside the opNuc function and it should be fine. Hope I helped!

Map function and iterables in python 3.6

I've inherited a piece of code that I need to run somewhere other than its original place, with some minor changes. I am trying to map a list of strings to something that applies a function to each element of that list using Python 3.6 (a language I am not familiar with).
I would like to use map rather than a list comprehension, but now I doubt this is possible.
In the following example I've tried combinations of for loops, yield (or not), and next(...) (or not), but I am not able to make the code work as expected.
I would like to see this printed:
AAA! xxx
Found: foo
Found: bar
each time the counter xxx modulo 360 is 0 (zero).
I understand that the map function does not execute the code, so I need to do something to "apply" that function to each element of the input list.
However, I am not able to make this work. The documentation at https://docs.python.org/3.6/library/functions.html#map and https://docs.python.org/3.6/howto/functional.html#iterators does not help much; I went through it, and I think at least one of the commented bits below (# <python code>) should have worked. I am not an experienced Python developer, and I think I am missing some gotchas about the syntax/conventions of Python 3.6 regarding iterators/generators.
issue_counter = 0

def foo_func(serious_stuff):
    # this is actually a call to a module to send an email with the "serious_stuff"
    print("Found: {}".format(serious_stuff))

def report_issue():
    global issue_counter
    # this actually executes once per minute (removed the logic to run this fast)
    while True:
        issue_counter += 1
        # every 6 hours (i.e. 360 minutes) I would like to send emails
        if issue_counter % 360 == 0:
            print("AAA! {}".format(issue_counter))
            # for stuff in map(foo_func, ["foo", "bar"]):
            #     yield stuff
            #     stuff()
            #     print(stuff)
            iterable_stuff = map(foo_func, ["foo", "bar"])
            for stuff in next(iterable_stuff):
                # yield stuff
                print(stuff)

report_issue()
I get lots of different errors/unexpected behaviors of that for loop when running the script:
not printing anything when I call print(...)
TypeError: 'NoneType' object is not callable
AttributeError: 'map' object has no attribute 'next'
TypeError: 'NoneType' object is not iterable
Printing what I am expecting, interleaved with None, e.g.:
AAA! 3047040
Found: foo
None
Found: bar
None
I found out that the call to next(iterable_thingy) actually invokes the mapped function.
Knowing the length of the input list used to generate the iterable, we know how many times to invoke next(iterable_thingy), so the function report_issue (from my previous example) runs as expected when defined like this:
def report_issue():
    global issue_counter
    original_data = ["foo", "bar"]
    # this executes once per minute
    while True:
        issue_counter += 1
        # every 6 hours I would like to send emails
        if issue_counter % 360 == 0:
            print("AAA! {}".format(issue_counter))
            iterable_stuff = map(foo_func, original_data)
            for idx in range(len(original_data)):
                next(iterable_stuff)
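A side note (my own, not from the original answer): because map is lazy, you can drain the whole iterator without tracking its length, either by looping over the map object directly or by skipping map entirely:

# Option 1: iterate the lazy map object; each step invokes foo_func.
for _ in map(foo_func, ["foo", "bar"]):
    pass

# Option 2: most idiomatic when the function is called only for its
# side effects -- use a plain loop and drop map altogether.
for item in ["foo", "bar"]:
    foo_func(item)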
To troubleshoot this iterable business, I found it useful to run ipython (an interactive REPL) to check the type and documentation of the generated iterable, like this:
In [2]: def foo_func(serious_stuff):
   ...:     # this is actually a call to a module to send an email with the "serious_stuff"
   ...:     print("Found: {}".format(serious_stuff))
   ...:

In [3]: iterable_stuff = map(foo_func, ["foo", "bar"])

In [4]: iterable_stuff?
Type:        map
String form: <map object at 0x7fcdbe8647b8>
Docstring:
map(func, *iterables) --> map object

Make an iterator that computes the function using arguments from
each of the iterables. Stops when the shortest iterable is exhausted.

In [5]: next(iterable_stuff)
Found: foo

In [6]: bar_item = next(iterable_stuff)
Found: bar

In [7]: bar_item?
Type:        NoneType
String form: None
Docstring:   <no docstring>
