Argument "xyz" to "ABC" has incompatible type "Tuple[None, ...]"; expected "Tuple[None]" - mypy

As an experiment, I wanted to add type annotations to my project and test it with mypy --strict. Consider the following code and the error message below:
#!/usr/bin/env python
import typing as T
from dataclasses import dataclass

@dataclass(frozen=True)
class Question:
    choices: T.Tuple[None]

def gen_question() -> Question:
    choices = [None]
    return Question(choices=tuple(choices))

if __name__ == '__main__':
    gen_question()
Here's the error message:
test.py:18: error: Argument "choices" to "Question" has incompatible type "Tuple[None, ...]"; expected "Tuple[None]"
Is there something I'm doing wrong, or is that a bug? How can I solve the problem?

It appears that, in the case of typing.Tuple, the documentation says a variable-length tuple must be specified by adding , ... as in the following:
choices: T.Tuple[None, ...]
Note that this doesn't seem to apply to lists.
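For example, a minimal sketch contrasting the two annotations (the tags field is an invented illustration, not part of the original code); this passes mypy --strict:

import typing as T
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    # A tuple annotation is fixed-length unless it ends with ", ..."
    choices: T.Tuple[None, ...]
    # A list is variable-length by definition, so no ellipsis is needed
    tags: T.List[str]


q = Question(choices=tuple([None, None]), tags=["a", "b"])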

Dagster: Multiple and Conditional Outputs (Type check failed for step output xxx PySparkDataFrame)

I'm executing the Dagster tutorial, and I got stuck at the Multiple and Conditional Outputs step.
In the solid definitions, it asks to declare (among other things):
output_defs=[
    OutputDefinition(
        name="hot_cereals", dagster_type=DataFrame, is_required=False
    ),
    OutputDefinition(
        name="cold_cereals", dagster_type=DataFrame, is_required=False
    ),
],
But there is no information about where DataFrame comes from.
First I tried pandas.DataFrame, but I got the error {dagster_type} is not a valid dagster type when submitting it via $ dagit -f multiple_outputs.py.
Then I installed dagster_pyspark and tried dagster_pyspark.DataFrame. This time I managed to submit the DAG to the UI. However, when I ran it from the UI, I got the following error:
dagster.core.errors.DagsterTypeCheckDidNotPass: Type check failed for step output hot_cereals of type PySparkDataFrame.
File "/Users/bambrozio/.local/share/virtualenvs/dagster-tutorial/lib/python3.7/site-packages/dagster/core/execution/plan/execute_plan.py", line 210, in _dagster_event_sequence_for_step
for step_event in check.generator(step_events):
File "/Users/bambrozio/.local/share/virtualenvs/dagster-tutorial/lib/python3.7/site-packages/dagster/core/execution/plan/execute_step.py", line 273, in core_dagster_event_sequence_for_step
for evt in _create_step_events_for_output(step_context, user_event):
File "/Users/bambrozio/.local/share/virtualenvs/dagster-tutorial/lib/python3.7/site-packages/dagster/core/execution/plan/execute_step.py", line 298, in _create_step_events_for_output
for output_event in _type_checked_step_output_event_sequence(step_context, output):
File "/Users/bambrozio/.local/share/virtualenvs/dagster-tutorial/lib/python3.7/site-packages/dagster/core/execution/plan/execute_step.py", line 221, in _type_checked_step_output_event_sequence
dagster_type=step_output.dagster_type,
Does anyone know how to fix it? Thanks for the help!
As Arthur pointed out, the full tutorial code is available on Dagster's GitHub.
However, you do not need dagster_pandas; rather, the key lines missing from your code are:
if typing.TYPE_CHECKING:
    DataFrame = list
else:
    DataFrame = PythonObjectDagsterType(list, name="DataFrame")  # type: Any
The reason for the above structure is to achieve MyPy compliance; see the Types & Expectations section of the tutorial.
See also the documentation on Dagster types.
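Putting it together, a minimal sketch of how the declared type can then be referenced in the solid's output definitions (assuming the legacy Dagster API used by the tutorial, where solid, OutputDefinition and PythonObjectDagsterType are importable from dagster):

import typing
from typing import Any

from dagster import OutputDefinition, PythonObjectDagsterType, solid

if typing.TYPE_CHECKING:
    DataFrame = list
else:
    DataFrame = PythonObjectDagsterType(list, name="DataFrame")  # type: Any


@solid(
    output_defs=[
        OutputDefinition(name="hot_cereals", dagster_type=DataFrame, is_required=False),
        OutputDefinition(name="cold_cereals", dagster_type=DataFrame, is_required=False),
    ],
)
def split_cereals(context, cereals):
    # yield Output(...) for whichever outputs apply, as in the question
    ...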
I was stuck here, too, but luckily I found the updated source code.
They have updated the docs so that the OutputDefinition is defined beforehand.
Update the code that comes before the sorting solid and the pipeline definition, as below:
import csv
import os

from dagster import (
    Bool,
    Field,
    Output,
    OutputDefinition,
    execute_pipeline,
    pipeline,
    solid,
)


@solid
def read_csv(context, csv_path):
    lines = []
    csv_path = os.path.join(os.path.dirname(__file__), csv_path)
    with open(csv_path, "r") as fd:
        for row in csv.DictReader(fd):
            row["calories"] = int(row["calories"])
            lines.append(row)
    context.log.info("Read {n_lines} lines".format(n_lines=len(lines)))
    return lines


@solid(
    config_schema={
        "process_hot": Field(Bool, is_required=False, default_value=True),
        "process_cold": Field(Bool, is_required=False, default_value=True),
    },
    output_defs=[
        OutputDefinition(name="hot_cereals", is_required=False),
        OutputDefinition(name="cold_cereals", is_required=False),
    ],
)
def split_cereals(context, cereals):
    if context.solid_config["process_hot"]:
        hot_cereals = [cereal for cereal in cereals if cereal["type"] == "H"]
        yield Output(hot_cereals, "hot_cereals")
    if context.solid_config["process_cold"]:
        cold_cereals = [cereal for cereal in cereals if cereal["type"] == "C"]
        yield Output(cold_cereals, "cold_cereals")
You can also find the complete code here.
First, try installing the Dagster pandas integration:
pip install dagster_pandas
Then do:
from dagster_pandas import DataFrame
You can find the code from the tutorial here.

Run the same method on a list of instances in pathos.multiprocessing

I am working on a traveling salesman problem. Since all agents traverse the same graph to find their own paths separately, I am trying to parallelize the agents' path-finding. In each iteration, all agents start from a start node, find their paths, and all the paths are collected to determine the best path of that iteration.
I am using pathos.multiprocessing.
The agent class has a path-finding method:
class Agent:
    def find_a_path(self, graph):
        # here is the logic to find a path by traversing the graph
        return found_path
I created a helper function to wrap the method:
def do_agent_find_a_path(agent, graph):
    return agent.find_a_path(graph)
Then I create a pool and call amap, passing the helper function, the list of agent instances, and the same graph:
pool = ProcessPool(nodes = 10)
res = pool.amap(do_agent_find_a_path, agents, [graph] * len(agents))
But the processes are created in sequence and it runs very slowly. I'd like some guidance on a correct/decent way to leverage pathos in this situation.
Thank you!
UPDATE:
I am using pathos 0.2.3 on Ubuntu:
Name: pathos
Version: 0.2.3
Summary: parallel graph management and execution in heterogeneous computing
Home-page: https://pypi.org/project/pathos
Author: Mike McKerns
I get the following error with the ThreadPool sample code:
>>> import pathos
>>> pathos.pools.ThreadPool().iumap(lambda x:x*x, [1,2,3,4])
Traceback (most recent call last):
  File "/opt/anaconda/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2910, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-5-f8f5e7774646>", line 1, in <module>
    pathos.pools.ThreadPool().iumap(lambda x:x*x, [1,2,3,4])
AttributeError: 'ThreadPool' object has no attribute 'iumap'
I'm the pathos author. I'm not sure how long your method takes to run, but from your comments, I'm going to assume not very long. I'd suggest that, if the method is "fast", you use a ThreadPool instead. Also, if you don't need to preserve the order of the results, the fastest map is typically uimap (unordered, iterative map).
>>> class Agent:
...     def basepath(self, dirname):
...         import os
...         return os.path.basename(dirname)
...     def slowpath(self, dirname):
...         import time
...         time.sleep(.2)
...         return self.basepath(dirname)
...
>>> a = Agent()
>>> import pathos.pools as pp
>>> dirs = ['/tmp/foo', '/var/path/bar', '/root/bin/bash', '/tmp/foo/bar']
>>> import time
>>> p = pp.ProcessPool()
>>> go = time.time(); tuple(p.uimap(a.basepath, dirs)); print(time.time()-go)
('foo', 'bar', 'bash', 'bar')
0.006751060485839844
>>> p.close(); p.join(); p.clear()
>>> t = pp.ThreadPool(4)
>>> go = time.time(); tuple(t.uimap(a.basepath, dirs)); print(time.time()-go)
('foo', 'bar', 'bash', 'bar')
0.0005156993865966797
>>> t.close(); t.join(); t.clear()
and, just to compare against something that takes a bit longer...
>>> t = pp.ThreadPool(4)
>>> go = time.time(); tuple(t.uimap(a.slowpath, dirs)); print(time.time()-go)
('bar', 'bash', 'bar', 'foo')
0.2055649757385254
>>> t.close(); t.join(); t.clear()
>>> p = pp.ProcessPool()
>>> go = time.time(); tuple(p.uimap(a.slowpath, dirs)); print(time.time()-go)
('foo', 'bar', 'bash', 'bar')
0.2084510326385498
>>> p.close(); p.join(); p.clear()
>>>
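Applied to the question's setup, a rough sketch (Agent.find_a_path and graph are placeholders taken from the question; whether ThreadPool or ProcessPool wins depends on how expensive the traversal actually is, and this assumes uimap accepts multiple iterables the same way the question's amap call does):

import pathos.pools as pp

class Agent:
    def find_a_path(self, graph):
        # placeholder for the real traversal logic
        return []

def do_agent_find_a_path(agent, graph):
    return agent.find_a_path(graph)

agents = [Agent() for _ in range(100)]
graph = {}  # the shared graph from the question

pool = pp.ThreadPool(10)  # swap in pp.ProcessPool(10) if the work is CPU-bound
paths = list(pool.uimap(do_agent_find_a_path, agents, [graph] * len(agents)))
pool.close(); pool.join(); pool.clear()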

Robot Framework Variable Class File with triple Nested Dictionary is not dot notation accessible

Using the Robot Framework documentation on Variable Files as a guide, I implemented a Variable File Class with the get_variables method. The basic example works as described.
When I implement a triple-nested dictionary (${A.B.C}), I can access the first two levels using the ${A} and ${A.B} notation. However, when I want to access the third node ${A.B.C}, I get an error:
Resolving variable '${A.B.C}' failed: AttributeError: 'OrderedDict' object
has no attribute 'C'
Of the three examples below, the first is an RF-generated nested dictionary whose nodes I can all access through the dot notation. The second is a plain Python dictionary returned by the Variable Class. In the last example the Variable Class returns an OrderedDict.
Although ${A['B']['C']['key']} works, it is harder to read in the code. When I load a similar YAML structure it supports the dot notation fully, but this is not an option, as the YAML file is static and I need the flexibility of Python for some of the key values.
So, I'm looking for some support on how to return a data structure that allows Robot Framework to interpret the full nested structure with the dot notation.
Variable Class File
from collections import OrderedDict


class OrderDict(object):

    def get_variables(self):
        C = OrderedDict([(u'key', u'value')])
        B = OrderedDict([(u'C', C)])
        A = OrderedDict([(u'B', B)])
        D = {
            u'E': {
                u'F': {
                    u'key': u'value'
                }
            }
        }
        return OrderedDict([(u'DICT__A', A), (u'DICT__D', D)])
Robot Framework Script
*** Test Cases ***
Dictionary RF
    ${Z}    Create Dictionary    key=value
    ${Y}    Create Dictionary    Z=${Z}
    ${X}    Create Dictionary    Y=${Y}
    Log To Console    ${EMPTY}
    Log To Console    ${X}
    Log To Console    ${X['Y']['Z']['key']}
    Log To Console    ${X.Y}
    Log To Console    ${X.Y.Z}
    Log To Console    ${X.Y.Z.key}

Plain Dictionary Variable Class
    Log To Console    ${EMPTY}
    Log To Console    ${D}
    Log To Console    ${D['E']['F']['key']}
    Log To Console    ${D.E}
    Log To Console    ${D.E.F}
    Log To Console    ${D.E.F.key}

Ordered Dictionary Variable Class
    Log To Console    ${EMPTY}
    Log To Console    ${A}
    Log To Console    ${A['B']['C']['key']}
    Log To Console    ${A.B}
    Log To Console    ${A.B.C}
    Log To Console    ${A.B.C.key}
Robot Framework Console Log
Suite Executor: Robot Framework 3.0.2 (Python 2.7.9 on win32)
Dictionary RF
{u'Y': {u'Z': {u'key': u'value'}}}
value
{u'Z': {u'key': u'value'}}
{u'key': u'value'}
value
| PASS |
------------------------------------------------------------------------------
Plain Dictionary Variable Class
{u'E': {u'F': {u'key': u'value'}}}
value
{u'F': {u'key': u'value'}}
| FAIL |
Resolving variable '${D.E.F}' failed: AttributeError: 'dict' object has no attribute 'F'
------------------------------------------------------------------------------
Ordered Dictionary Variable Class
{u'B': OrderedDict([(u'C', OrderedDict([(u'key', u'value')]))])}
value
OrderedDict([(u'C', OrderedDict([(u'key', u'value')]))])
| FAIL |
Resolving variable '${A.B.C}' failed: AttributeError: 'OrderedDict' object has no
attribute 'C'
In the Robot Framework Slack channel, Pekka Klarck pointed out that Robot Framework internally uses the robot.utils.DotDict class. Having get_variables() return a DotDict structure resolved my issue and I can now use the dot notation. Below is the code for the Variable Class DotDic (stored as DotDic.py).
from robot.utils import DotDict


class DotDic(object):

    def get_variables(self):
        G = {
            u'H': {
                u'I': {
                    u'key': u'value'
                }
            }
        }
        return {u'G': self.dict_to_dotdict(G)}

    def dict_to_dotdict(self, dct):
        dd = DotDict({})
        for key, val in dct.items():
            if isinstance(val, dict):
                dd[key] = self.dict_to_dotdict(val)
            else:
                dd[key] = val
        return dd
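As a quick sanity check of the dot access this provides, a minimal sketch that can be run in a plain Python session with Robot Framework installed:

from robot.utils import DotDict

# Each level is a DotDict, so attribute access works all the way down.
leaf = DotDict({'key': 'value'})
middle = DotDict({'I': leaf})
top = DotDict({'H': middle})
print(top.H.I.key)  # value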

How to get sys.exc_traceback from IPython shell.run_code?

My app interfaces with the IPython Qt shell with code something like this:
from IPython.core.interactiveshell import ExecutionResult
shell = self.kernelApp.shell # ZMQInteractiveShell
code = compile(script, file_name, 'exec')
result = ExecutionResult()
shell.run_code(code, result=result)
if result:
    self.show_result(result)
The problem is: how can show_result show the traceback resulting from exceptions in code?
Neither the error_before_exec nor the error_in_exec ivars of ExecutionResult seem to give references to the traceback. Similarly, neither sys nor shell.user_ns.namespace.get('sys') has a sys.exc_traceback attribute.
Any ideas? Thanks!
Edward
IPython/core/interactiveshell.py contains InteractiveShell._showtraceback:
def _showtraceback(self, etype, evalue, stb):
    """Actually show a traceback. Subclasses may override..."""
    print(self.InteractiveTB.stb2text(stb), file=io.stdout)
The solution is to monkey-patch IS._showtraceback so that it writes to sys.stderr (which shows up in the Qt console):
from __future__ import print_function
...
shell = self.kernelApp.shell  # ZMQInteractiveShell
code = compile(script, file_name, 'exec')

def show_traceback(etype, evalue, stb, shell=shell):
    print(shell.InteractiveTB.stb2text(stb), file=sys.stderr)
    sys.stderr.flush()  # <==== Oh, so important

old_show = getattr(shell, '_showtraceback', None)
shell._showtraceback = show_traceback
shell.run_code(code)
if old_show: shell._showtraceback = old_show
Note: there is no need to pass an ExecutionResult object to shell.run_code().
EKR
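A slightly more defensive variant of the same idea (a sketch, not from the original answer) restores the old hook in a finally block so it is put back even if run_code itself raises:

from __future__ import print_function
import sys

def run_with_stderr_traceback(shell, code):
    # Route tracebacks to sys.stderr while the code runs, then restore the hook.
    def show_traceback(etype, evalue, stb, shell=shell):
        print(shell.InteractiveTB.stb2text(stb), file=sys.stderr)
        sys.stderr.flush()

    old_show = getattr(shell, '_showtraceback', None)
    shell._showtraceback = show_traceback
    try:
        shell.run_code(code)
    finally:
        if old_show:
            shell._showtraceback = old_show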

Create a portal_user_catalog and have it used (Plone)

I'm creating a fork of my Plone site (which has not been forked for a long time). This site has a special catalog object for user profiles (a special Archetypes-based object type) which is called portal_user_catalog:
$ bin/instance debug
>>> portal = app.Plone
>>> print [d for d in portal.objectMap() if d['meta_type'] == 'Plone Catalog Tool']
[{'meta_type': 'Plone Catalog Tool', 'id': 'portal_catalog'},
{'meta_type': 'Plone Catalog Tool', 'id': 'portal_user_catalog'}]
This looks reasonable because the user profiles don't have most of the indexes of the "normal" objects, but have a small set of their own indexes.
Since I found no way to create this object from scratch, I exported it from the old site (as portal_user_catalog.zexp) and imported it into the new site. This seemed to work, but I can't add objects to the imported catalog, not even by explicitly calling the catalog_object method. Instead, the user profiles are added to the standard portal_catalog.
Now I found a module in my product which seems to serve the purpose (Products/myproduct/exportimport/catalog.py):
"""Catalog tool setup handlers.
$Id: catalog.py 77004 2007-06-24 08:57:54Z yuppie $
"""
from Products.GenericSetup.utils import exportObjects
from Products.GenericSetup.utils import importObjects
from Products.CMFCore.utils import getToolByName
from zope.component import queryMultiAdapter
from Products.GenericSetup.interfaces import IBody
def importCatalogTool(context):
"""Import catalog tool.
"""
site = context.getSite()
obj = getToolByName(site, 'portal_user_catalog')
parent_path=''
if obj and not obj():
importer = queryMultiAdapter((obj, context), IBody)
path = '%s%s' % (parent_path, obj.getId().replace(' ', '_'))
__traceback_info__ = path
print [importer]
if importer:
print importer.name
if importer.name:
path = '%s%s' % (parent_path, 'usercatalog')
print path
filename = '%s%s' % (path, importer.suffix)
print filename
body = context.readDataFile(filename)
if body is not None:
importer.filename = filename # for error reporting
importer.body = body
if getattr(obj, 'objectValues', False):
for sub in obj.objectValues():
importObjects(sub, path+'/', context)
def exportCatalogTool(context):
"""Export catalog tool.
"""
site = context.getSite()
obj = getToolByName(site, 'portal_user_catalog', None)
if tool is None:
logger = context.getLogger('catalog')
logger.info('Nothing to export.')
return
parent_path=''
exporter = queryMultiAdapter((obj, context), IBody)
path = '%s%s' % (parent_path, obj.getId().replace(' ', '_'))
if exporter:
if exporter.name:
path = '%s%s' % (parent_path, 'usercatalog')
filename = '%s%s' % (path, exporter.suffix)
body = exporter.body
if body is not None:
context.writeDataFile(filename, body, exporter.mime_type)
if getattr(obj, 'objectValues', False):
for sub in obj.objectValues():
exportObjects(sub, path+'/', context)
I tried to use it, but I have no idea how it is supposed to be done;
I can't call it TTW (should I try to publish the methods?!).
I tried it in a debug session:
$ bin/instance debug
>>> portal = app.Plone
>>> from Products.myproduct.exportimport.catalog import exportCatalogTool
>>> exportCatalogTool(portal)
Traceback (most recent call last):
File "<console>", line 1, in <module>
File ".../Products/myproduct/exportimport/catalog.py", line 58, in exportCatalogTool
site = context.getSite()
AttributeError: getSite
So, if this is the way to go, it looks like I need a "real" context.
Update: To get this context, I tried an External Method:
# -*- coding: utf-8 -*-
from Products.myproduct.exportimport.catalog import exportCatalogTool
from pdb import set_trace


def p(dt, dd):
    print '%-16s%s' % (dt+':', dd)


def main(self):
    """
    Export the portal_user_catalog
    """
    g = globals()
    print '#' * 79
    for a in ('__package__', '__module__'):
        if a in g:
            p(a, g[a])
    p('self', self)
    set_trace()
    exportCatalogTool(self)
However, when I called it, I got the same <PloneSite at /Plone> object as the argument to the main function, and it didn't have the getSite attribute. Perhaps my site doesn't call such External Methods correctly?
Or would I need to mention this module somehow in my configure.zcml, and if so, how? I searched my directory tree (especially below Products/myproduct/profiles) for exportimport, the module name, and several other strings, but I couldn't find anything; perhaps there was an integration once but it broke ...
So how do I make this portal_user_catalog work?
Thank you!
Update: Another debug session suggests that the source of the problem is a transaction issue:
>>> portal = app.Plone
>>> puc = portal.portal_user_catalog
>>> puc._catalog()
[]
>>> profiles_folder = portal.some_folder_with_profiles
>>> for o in profiles_folder.objectValues():
...     puc.catalog_object(o)
...
>>> puc._catalog()
[<Products.ZCatalog.Catalog.mybrains object at 0x69ff8d8>, ...]
This population of the portal_user_catalog doesn't persist; after termination of the debug session and starting fg, the brains are gone.
It looks like the problem was indeed related to transactions.
I had
import transaction
...
class Browser(BrowserView):
    ...
    def processNewUser(self):
        ....
        transaction.commit()
before, but apparently this was not good enough (and/or perhaps not done correctly).
Now I start the transaction explicitly with transaction.begin(), save intermediate results with transaction.savepoint(), abort the transaction explicitly with transaction.abort() in case of errors (try / except), and have exactly one transaction.commit() at the end, in the case of success. Everything seems to work.
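A rough sketch of that pattern (the method and variable names here are illustrative, not taken from the actual product code):

import transaction

def process_new_user(self, profile):
    transaction.begin()
    try:
        self.context.portal_user_catalog.catalog_object(profile)
        transaction.savepoint()      # keep intermediate results
    except Exception:
        transaction.abort()          # roll everything back on error
        raise
    transaction.commit()             # exactly one commit on success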
Of course, Plone still doesn't take this non-standard catalog into account; when I "clear and rebuild" it, it is empty afterwards. But for my application it works well enough.
