How to load Hydra parameters from previous jobs (without having to use argparse and the compose API)? - fb-hydra

I'm using Hydra for training machine learning models. It's great for doing complex commands like python train.py data=MNIST batch_size=64 loss=l2. However, if I want to then run the trained model with the same parameters, I have to do something like python reconstruct.py --config_file path_to_previous_job/.hydra/config.yaml. I then use argparse to load in the previous yaml and use the compose API to initialize the Hydra environment. The path to the trained model is inferred from the path to Hydra's .yaml file. If I want to modify one of the parameters, I have to add additional argparse parameters and run something like python reconstruct.py --config_file path_to_previous_job/.hydra/config.yaml --batch_size 128. The code then manually overrides any Hydra parameters with those that were specified on the command line.
What's the right way of doing this?
My current code looks something like the following:
train.py:
import hydra
#hydra.main(config_name="config", config_path="conf")
def main(cfg):
# [training code using cfg.data, cfg.batch_size, cfg.loss etc.]
# [code outputs model checkpoint to job folder generated by Hydra]
main()
reconstruct.py:
import argparse
import os
from hydra.experimental import initialize, compose
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument('hydra_config')
parser.add_argument('--batch_size', type=int)
# [other flags and parameters I may need to override]
args = parser.parse_args()
# Create the Hydra environment.
initialize()
cfg = compose(config_name=args.hydra_config)
# Since checkpoints are stored next to the .hydra, we manually generate the path.
checkpoint_dir = os.path.dirname(os.path.dirname(args.hydra_config))
# Manually override any parameters which can be changed on the command line.
batch_size = args.batch_size if args.batch_size else cfg.data.batch_size
# [code which uses checkpoint_dir to load the model]
# [code which uses both batch_size and params in cfg to set up the data etc.]
This is my first time posting, so let me know if I should clarify anything.

If you want to load the previous config as is and not change it, use OmegaConf.load(file_path).
If you want to re-compose the config (and it sounds like you do, because you added that you want override things), I recommend that you use the Compose API and pass in parameters from the overrides file in the job output directory (next to the stored config.yaml), but concatenate the current run parameters.
This script seems to be doing the job:
import os
from dataclasses import dataclass
from os.path import join
from typing import Optional
from omegaconf import OmegaConf
import hydra
from hydra import compose
from hydra.core.config_store import ConfigStore
from hydra.core.hydra_config import HydraConfig
from hydra.utils import to_absolute_path
# You can also use a yaml config file instead of this Structured Config
#dataclass
class Config:
load_checkpoint: Optional[str] = None
batch_size: int = 16
loss: str = "l2"
cs = ConfigStore.instance()
cs.store(name="config", node=Config)
#hydra.main(config_path=".", config_name="config")
def my_app(cfg: Config) -> None:
if cfg.load_checkpoint is not None:
output_dir = to_absolute_path(cfg.load_checkpoint)
original_overrides = OmegaConf.load(join(output_dir, ".hydra/overrides.yaml"))
current_overrides = HydraConfig.get().overrides.task
hydra_config = OmegaConf.load(join(output_dir, ".hydra/hydra.yaml"))
# getting the config name from the previous job.
config_name = hydra_config.hydra.job.config_name
# concatenating the original overrides with the current overrides
overrides = original_overrides + current_overrides
# compose a new config from scratch
cfg = compose(config_name, overrides=overrides)
# train
print("Running in ", os.getcwd())
print(OmegaConf.to_yaml(cfg))
if __name__ == "__main__":
my_app()
~/tmp$ python train.py
Running in /home/omry/tmp/outputs/2021-04-19/21-23-13
load_checkpoint: null
batch_size: 16
loss: l2
~/tmp$ python train.py load_checkpoint=/home/omry/tmp/outputs/2021-04-19/21-23-13
Running in /home/omry/tmp/outputs/2021-04-19/21-23-22
load_checkpoint: /home/omry/tmp/outputs/2021-04-19/21-23-13
batch_size: 16
loss: l2
~/tmp$ python train.py load_checkpoint=/home/omry/tmp/outputs/2021-04-19/21-23-13 batch_size=32
Running in /home/omry/tmp/outputs/2021-04-19/21-23-28
load_checkpoint: /home/omry/tmp/outputs/2021-04-19/21-23-13
batch_size: 32
loss: l2

Related

Running all python unittests from different modules as a suite having parameterized class constructor

I am working on a design pattern to make my python unittest as a POM, so far I have written my page classes in modules HomePageObject.py,FilterPageObject.py, my base class (for common stuff)TestBase in BaseTest.py, my testcase modules are TestCase1.py and TestCase2.py and one runner module runner.py.
In runner class i am using loader.getTestCaseNames to get all the tests from a testcase class of a module. In both the testcase modules the name of the test class is same 'Test' and also the method name is same 'testName'
Since the names are confilicting while importing it in runner, only one test is getting executed. I want python to scan all the modules that i specify for tests in them and run those even the name of classes are same.
I got to know that nose might be helpful in this, but not sure how i can implement it here. Any advice ?
BaseTest.py
from selenium import webdriver
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver import ChromeOptions
import unittest
class TestBase(unittest.TestCase):
driver = None
def __init__(self,testName,browser):
self.browser = browser
super(TestBase,self).__init__(testName)
def setUp(self):
if self.browser == "firefox":
TestBase.driver = webdriver.Firefox()
elif self.browser == "chrome":
options = ChromeOptions()
options.add_argument("--start-maximized")
TestBase.driver = webdriver.Chrome(chrome_options=options)
self.url = "https://www.airbnb.co.in/"
self.driver = TestBase.getdriver()
TestBase.driver.implicitly_wait(10)
def tearDown(self):
self.driver.quit()
#staticmethod
def getdriver():
return TestBase.driver
#staticmethod
def waitForElementVisibility(locator, expression, message):
try:
WebDriverWait(TestBase.driver, 20).\
until(EC.presence_of_element_located((locator, expression)),
message)
return True
except:
return False
TestCase1.py and TestCase2.py (same)
from airbnb.HomePageObject import HomePage
from airbnb.BaseTest import TestBase
class Test(TestBase):
def __init__(self,testName,browser):
super(Test,self).__init__(testName,browser)
def testName(self):
try:
self.driver.get(self.url)
h_page = HomePage()
f_page = h_page.seachPlace("Sicily,Italy")
f_page.selectExperience()
finally:
self.driver.quit()
runner.py
import unittest
from airbnb.TestCase1 import Test
from airbnb.TestCase2 import Test
loader = unittest.TestLoader()
test_names = loader.getTestCaseNames(Test)
suite = unittest.TestSuite()
for test in test_names:
suite.addTest(Test(test,"chrome"))
runner = unittest.TextTestRunner()
result = runner.run(suite)
Also even that one test case is getting passed, some error message is coming
Ran 1 test in 9.734s
OK
Traceback (most recent call last):
File "F:\eclipse-jee-neon-3-win32\eclipse\plugins\org.python.pydev.core_6.3.3.201805051638\pysrc\runfiles.py", line 275, in <module>
main()
File "F:\eclipse-jee-neon-3-win32\eclipse\plugins\org.python.pydev.core_6.3.3.201805051638\pysrc\runfiles.py", line 97, in main
return pydev_runfiles.main(configuration) # Note: still doesn't return a proper value.
File "F:\eclipse-jee-neon-3-win32\eclipse\plugins\org.python.pydev.core_6.3.3.201805051638\pysrc\_pydev_runfiles\pydev_runfiles.py", line 874, in main
PydevTestRunner(configuration).run_tests()
File "F:\eclipse-jee-neon-3-win32\eclipse\plugins\org.python.pydev.core_6.3.3.201805051638\pysrc\_pydev_runfiles\pydev_runfiles.py", line 773, in run_tests
all_tests = self.find_tests_from_modules(file_and_modules_and_module_name)
File "F:\eclipse-jee-neon-3-win32\eclipse\plugins\org.python.pydev.core_6.3.3.201805051638\pysrc\_pydev_runfiles\pydev_runfiles.py", line 629, in find_tests_from_modules
suite = loader.loadTestsFromModule(m)
File "C:\Python27\lib\unittest\loader.py", line 65, in loadTestsFromModule
tests.append(self.loadTestsFromTestCase(obj))
File "C:\Python27\lib\unittest\loader.py", line 56, in loadTestsFromTestCase
loaded_suite = self.suiteClass(map(testCaseClass, testCaseNames))
TypeError: __init__() takes exactly 3 arguments (2 given)
I did this by searching for all the modules of test classes with a pattern and then used __import__(modulename) and called its Test class with desired parameters,
Here is my runner.py
import unittest
import glob
loader = unittest.TestLoader()
suite = unittest.TestSuite()
test_file_strings = glob.glob('Test*.py')
module_strings = [str[0:len(str)-3] for str in test_file_strings]
for module in module_strings:
mod = __import__(module)
test_names =loader.getTestCaseNames(mod.Test)
for test in test_names:
suite.addTest(mod.Test(test,"chrome"))
runner = unittest.TextTestRunner()
result = runner.run(suite)
This worked but still looking for some organized solutions.
(Not sure why second time its showing Ran 0 tests in 0.000s )
Finding files... done.
Importing test modules ... ..done.
----------------------------------------------------------------------
Ran 2 tests in 37.491s
OK
----------------------------------------------------------------------
Ran 0 tests in 0.000s
OK

How can I get a one line per test result with Robot Framework?

I want to take test case results from Robot Framework runs and import those results into other tools (ElasticSearch, ALM tools, etc).
Towards that end I would like to be able to generate a text file with one line per test. Here is an example line pipe delimited:
testcase name | time run | duration | status
There are other fields I would add but those are the basic ones. Any help appreciated. I have been looking at robot.result http://robot-framework.readthedocs.io/en/3.0.2/autodoc/robot.result.html but haven't figured it out yet. If/when I do I will post answer here.
Thanks,
The output.xml file is very easy to parse with normal XML parsing libraries.
Here's a quick example:
from __future__ import print_function
import xml.etree.ElementTree as ET
from datetime import datetime
def get_robot_results(filepath):
results = []
with open(filepath, "r") as f:
xml = ET.parse(f)
root = xml.getroot()
if root.tag != "robot":
raise Exception("expect root tag 'robot', got '%s'" % root.tag)
for suite_node in root.findall("suite"):
for test_node in suite_node.findall("test"):
status_node = test_node.find("status")
name = test_node.attrib["name"]
status = status_node.attrib["status"]
start = status_node.attrib["starttime"]
end = status_node.attrib["endtime"]
start_time = datetime.strptime(start, '%Y%m%d %H:%M:%S.%f')
end_time = datetime.strptime(end, '%Y%m%d %H:%M:%S.%f')
elapsed = str(end_time-start_time)
results.append([name, start, elapsed, status])
return results
if __name__ == "__main__":
results = get_robot_results("output.xml")
for row in results:
print(" | ".join(row))
Bryan is right that it's easy to parse Robot's output.xml using standard XML parsing modules. Alternatively you can use Robot's own result parsing modules and the model you get from it:
from robot.api import ExecutionResult, SuiteVisitor
class PrintTestInfo(SuiteVisitor):
def visit_test(self, test):
print('{} | {} | {} | {}'.format(test.name, test.starttime,
test.elapsedtime, test.status))
result = ExecutionResult('output.xml')
result.suite.visit(PrintTestInfo())
For more details about the APIs used above see http://robot-framework.readthedocs.io/.

How to get sys.exc_traceback form IPython shell.run_code?

My app interfaces with the IPython Qt shell with code something like this:
from IPython.core.interactiveshell import ExecutionResult
shell = self.kernelApp.shell # ZMQInteractiveShell
code = compile(script, file_name, 'exec')
result = ExecutionResult()
shell.run_code(code, result=result)
if result:
self.show_result(result)
The problem is: how can show_result show the traceback resulting from exceptions in code?
Neither the error_before_exec nor the error_in_exec ivars of ExecutionResult seem to give references to the traceback. Similarly, neither sys nor shell.user_ns.namespace.get('sys') have sys.exc_traceback attributes.
Any ideas? Thanks!
Edward
IPython/core/interactiveshell.py contains InteractiveShell._showtraceback:
def _showtraceback(self, etype, evalue, stb):
"""Actually show a traceback. Subclasses may override..."""
print(self.InteractiveTB.stb2text(stb), file=io.stdout)
The solution is to monkey-patch IS._showtraceback so that it writes to sys.stdout (the Qt console):
from __future__ import print_function
...
shell = self.kernelApp.shell # ZMQInteractiveShell
code = compile(script, file_name, 'exec')
def show_traceback(etype, evalue, stb, shell=shell):
print(shell.InteractiveTB.stb2text(stb), file=sys.stderr)
sys.stderr.flush() # <==== Oh, so important
old_show = getattr(shell, '_showtraceback', None)
shell._showtraceback = show_traceback
shell.run_code(code)
if old_show: shell._showtraceback = old_show
Note: there is no need to pass an ExecutionResult object to shell.run_code().
EKR

Create a portal_user_catalog and have it used (Plone)

I'm creating a fork of my Plone site (which has not been forked for a long time). This site has a special catalog object for user profiles (a special Archetypes-based object type) which is called portal_user_catalog:
$ bin/instance debug
>>> portal = app.Plone
>>> print [d for d in portal.objectMap() if d['meta_type'] == 'Plone Catalog Tool']
[{'meta_type': 'Plone Catalog Tool', 'id': 'portal_catalog'},
{'meta_type': 'Plone Catalog Tool', 'id': 'portal_user_catalog'}]
This looks reasonable because the user profiles don't have most of the indexes of the "normal" objects, but have a small set of own indexes.
Since I found no way how to create this object from scratch, I exported it from the old site (as portal_user_catalog.zexp) and imported it in the new site. This seemed to work, but I can't add objects to the imported catalog, not even by explicitly calling the catalog_object method. Instead, the user profiles are added to the standard portal_catalog.
Now I found a module in my product which seems to serve the purpose (Products/myproduct/exportimport/catalog.py):
"""Catalog tool setup handlers.
$Id: catalog.py 77004 2007-06-24 08:57:54Z yuppie $
"""
from Products.GenericSetup.utils import exportObjects
from Products.GenericSetup.utils import importObjects
from Products.CMFCore.utils import getToolByName
from zope.component import queryMultiAdapter
from Products.GenericSetup.interfaces import IBody
def importCatalogTool(context):
"""Import catalog tool.
"""
site = context.getSite()
obj = getToolByName(site, 'portal_user_catalog')
parent_path=''
if obj and not obj():
importer = queryMultiAdapter((obj, context), IBody)
path = '%s%s' % (parent_path, obj.getId().replace(' ', '_'))
__traceback_info__ = path
print [importer]
if importer:
print importer.name
if importer.name:
path = '%s%s' % (parent_path, 'usercatalog')
print path
filename = '%s%s' % (path, importer.suffix)
print filename
body = context.readDataFile(filename)
if body is not None:
importer.filename = filename # for error reporting
importer.body = body
if getattr(obj, 'objectValues', False):
for sub in obj.objectValues():
importObjects(sub, path+'/', context)
def exportCatalogTool(context):
"""Export catalog tool.
"""
site = context.getSite()
obj = getToolByName(site, 'portal_user_catalog', None)
if tool is None:
logger = context.getLogger('catalog')
logger.info('Nothing to export.')
return
parent_path=''
exporter = queryMultiAdapter((obj, context), IBody)
path = '%s%s' % (parent_path, obj.getId().replace(' ', '_'))
if exporter:
if exporter.name:
path = '%s%s' % (parent_path, 'usercatalog')
filename = '%s%s' % (path, exporter.suffix)
body = exporter.body
if body is not None:
context.writeDataFile(filename, body, exporter.mime_type)
if getattr(obj, 'objectValues', False):
for sub in obj.objectValues():
exportObjects(sub, path+'/', context)
I tried to use it, but I have no idea how it is supposed to be done;
I can't call it TTW (should I try to publish the methods?!).
I tried it in a debug session:
$ bin/instance debug
>>> portal = app.Plone
>>> from Products.myproduct.exportimport.catalog import exportCatalogTool
>>> exportCatalogTool(portal)
Traceback (most recent call last):
File "<console>", line 1, in <module>
File ".../Products/myproduct/exportimport/catalog.py", line 58, in exportCatalogTool
site = context.getSite()
AttributeError: getSite
So, if this is the way to go, it looks like I need a "real" context.
Update: To get this context, I tried an External Method:
# -*- coding: utf-8 -*-
from Products.myproduct.exportimport.catalog import exportCatalogTool
from pdb import set_trace
def p(dt, dd):
print '%-16s%s' % (dt+':', dd)
def main(self):
"""
Export the portal_user_catalog
"""
g = globals()
print '#' * 79
for a in ('__package__', '__module__'):
if a in g:
p(a, g[a])
p('self', self)
set_trace()
exportCatalogTool(self)
However, wenn I called it, I got the same <PloneSite at /Plone> object as the argument to the main function, which didn't have the getSite attribute. Perhaps my site doesn't call such External Methods correctly?
Or would I need to mention this module somehow in my configure.zcml, but how? I searched my directory tree (especially below Products/myproduct/profiles) for exportimport, the module name, and several other strings, but I couldn't find anything; perhaps there has been an integration once but was broken ...
So how do I make this portal_user_catalog work?
Thank you!
Update: Another debug session suggests the source of the problem to be some transaction matter:
>>> portal = app.Plone
>>> puc = portal.portal_user_catalog
>>> puc._catalog()
[]
>>> profiles_folder = portal.some_folder_with_profiles
>>> for o in profiles_folder.objectValues():
... puc.catalog_object(o)
...
>>> puc._catalog()
[<Products.ZCatalog.Catalog.mybrains object at 0x69ff8d8>, ...]
This population of the portal_user_catalog doesn't persist; after termination of the debug session and starting fg, the brains are gone.
It looks like the problem was indeed related with transactions.
I had
import transaction
...
class Browser(BrowserView):
...
def processNewUser(self):
....
transaction.commit()
before, but apparently this was not good enough (and/or perhaps not done correctly).
Now I start the transaction explicitly with transaction.begin(), save intermediate results with transaction.savepoint(), abort the transaction explicitly with transaction.abort() in case of errors (try / except), and have exactly one transaction.commit() at the end, in the case of success. Everything seems to work.
Of course, Plone still doesn't take this non-standard catalog into account; when I "clear and rebuild" it, it is empty afterwards. But for my application it works well enough.

Development Mode For uWSGI/Pylons (Reload new code)

I have a setup such that an nginx server passes control off to uWsgi, which launches a pylons app using the following in my xml configuration file:
<ini-paste>...</ini-paste>
Everything is working nicely, and I was able to set it to debug mode using the following in the associated ini file, like:
debug = true
Except debug mode only prints out errors, and doesn't reload the code everytime a file has been touched. If I was running directly through paste, I could use the --reload option, but going through uWsgi complicates things.
Does anybody know of a way to tell uWsgi to tell paste to set the --reload option, or to do this directly in the paste .ini file?
I used something like the following code to solve this, the monitorFiles(...) method is called on application initialization, and it monitors the files, sending the TERM signal when it sees a change.
I'd still much prefer a solution using paster's --reload argument, as I imagine this solution has bugs:
import os
import time
import signal
from deepthought.system import deployment
from multiprocessing.process import Process
def monitorFiles():
if deployment.getDeployment().dev and not FileMonitor.isRunning:
monitor = FileMonitor(os.getpid())
try: monitor.start()
except: print "Something went wrong..."
class FileMonitor(Process):
isRunning = False
def __init__(self, masterPid):
self.updates = {}
self.rootDir = deployment.rootDir() + "/src/python"
self.skip = len(self.rootDir)
self.masterPid = masterPid
FileMonitor.isRunning = True
Process.__init__(self)
def run(self):
while True:
self._loop()
time.sleep(5)
def _loop(self):
for root, _, files in os.walk(self.rootDir):
for file in files:
if file.endswith(".py"):
self._monitorFile(root, file)
def _monitorFile(self, root, file):
mtime = os.path.getmtime("%s/%s" % (root, file))
moduleName = "%s/%s" % (root[self.skip+1:], file[:-3])
moduleName = moduleName.replace("/",".")
if not moduleName in self.updates:
self.updates[moduleName] = mtime
elif self.updates[moduleName] < mtime:
print "Change detected in %s" % moduleName
self._restartWorker()
self.updates[moduleName] = mtime
def _restartWorker(self):
os.kill(self.masterPid, signal.SIGTERM)
Use the signal framework in 0.9.7 tree
http://projects.unbit.it/uwsgi/wiki/SignalFramework
An example of auto-reloading:
import uwsgi
uwsgi.register_signal(1, "", uwsgi.reload)
uwsgi.add_file_monitor(1, 'myfile.py')
def application(env, start_response):
...

Resources