I need to find all Python files in a folder, excluding __init__.py.
My first attempt was:
import re
search_path.rglob(re.compile("(?!__init__).*.py"))
This code fails, so I ended up with:
filter(
    lambda path: '__init__.py' != path.name and path.name.endswith('.py') and path.is_file(),
    search_path.rglob("*.py"),
)
It looks like rglob does not support Python regular expressions.
Why?
Does rglob support negative patterns?
Can this code be more elegant?
I need something very much like this too. I came up with this:
import pathlib
search_path = pathlib.Path.cwd() / "test_folder"
for file in filter(lambda item: item.name != "__init__.py", search_path.rglob("./*.py")):
    print(f"{file}")
Alternatively, you can use fnmatch.
import pathlib
import fnmatch
search_path = pathlib.Path.cwd() / "test_folder"
for file in search_path.rglob("./*.py"):
    if not fnmatch.fnmatch(file.name, "__init__.py"):
        print(f"{file}")
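On the "why" asked above: Path.rglob takes a glob pattern string rather than a compiled regular expression, and as far as I know glob syntax has no way to exclude a whole file name, so the exclusion has to happen in a separate filtering step. Below is a minimal sketch of that step written as a generator expression, plus a variant that applies a regex to the file names after globbing. The names find_modules and find_by_regex are hypothetical helpers for illustration, not part of pathlib.

```python
import re
from pathlib import Path


def find_modules(search_path):
    """Return every .py file under search_path except __init__.py, sorted."""
    return sorted(
        path
        for path in Path(search_path).rglob("*.py")
        if path.is_file() and path.name != "__init__.py"
    )


def find_by_regex(search_path, pattern):
    """Return files whose *name* fully matches the regex, sorted.

    rglob("*") does the directory walk; the regex is applied afterwards.
    """
    regex = re.compile(pattern)
    return sorted(
        path
        for path in Path(search_path).rglob("*")
        if path.is_file() and regex.fullmatch(path.name)
    )


if __name__ == "__main__":
    # Negative lookahead rejects exactly "__init__.py", accepts other .py names.
    for module in find_by_regex(Path.cwd(), r"(?!__init__\.py$).*\.py"):
        print(module)
```

The regex variant is only worth it when the exclusion rule is more complex than a single name; for this case the plain comparison in find_modules is probably easier to read.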
I'm using Hydra for training machine learning models. It's great for doing complex commands like python train.py data=MNIST batch_size=64 loss=l2. However, if I want to then run the trained model with the same parameters, I have to do something like python reconstruct.py --config_file path_to_previous_job/.hydra/config.yaml. I then use argparse to load in the previous yaml and use the compose API to initialize the Hydra environment. The path to the trained model is inferred from the path to Hydra's .yaml file. If I want to modify one of the parameters, I have to add additional argparse parameters and run something like python reconstruct.py --config_file path_to_previous_job/.hydra/config.yaml --batch_size 128. The code then manually overrides any Hydra parameters with those that were specified on the command line.
What's the right way of doing this?
My current code looks something like the following:
train.py:
import hydra
@hydra.main(config_name="config", config_path="conf")
def main(cfg):
    # [training code using cfg.data, cfg.batch_size, cfg.loss etc.]
    # [code outputs model checkpoint to job folder generated by Hydra]
    ...
main()
reconstruct.py:
import argparse
import os
from hydra.experimental import initialize, compose
if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('hydra_config')
    parser.add_argument('--batch_size', type=int)
    # [other flags and parameters I may need to override]
    args = parser.parse_args()
    # Create the Hydra environment.
    initialize()
    cfg = compose(config_name=args.hydra_config)
    # Since checkpoints are stored next to the .hydra, we manually generate the path.
    checkpoint_dir = os.path.dirname(os.path.dirname(args.hydra_config))
    # Manually override any parameters which can be changed on the command line.
    batch_size = args.batch_size if args.batch_size else cfg.data.batch_size
    # [code which uses checkpoint_dir to load the model]
    # [code which uses both batch_size and params in cfg to set up the data etc.]
This is my first time posting, so let me know if I should clarify anything.
If you want to load the previous config as is and not change it, use OmegaConf.load(file_path).
If you want to re-compose the config (and it sounds like you do, because you added that you want to override things), I recommend that you use the Compose API and pass in the parameters from the overrides file in the job output directory (next to the stored config.yaml), concatenated with the current run's parameters.
This script seems to be doing the job:
import os
from dataclasses import dataclass
from os.path import join
from typing import Optional
from omegaconf import OmegaConf
import hydra
from hydra import compose
from hydra.core.config_store import ConfigStore
from hydra.core.hydra_config import HydraConfig
from hydra.utils import to_absolute_path
# You can also use a yaml config file instead of this Structured Config
@dataclass
class Config:
    load_checkpoint: Optional[str] = None
    batch_size: int = 16
    loss: str = "l2"
cs = ConfigStore.instance()
cs.store(name="config", node=Config)
@hydra.main(config_path=".", config_name="config")
def my_app(cfg: Config) -> None:
    if cfg.load_checkpoint is not None:
        output_dir = to_absolute_path(cfg.load_checkpoint)
        original_overrides = OmegaConf.load(join(output_dir, ".hydra/overrides.yaml"))
        current_overrides = HydraConfig.get().overrides.task
        hydra_config = OmegaConf.load(join(output_dir, ".hydra/hydra.yaml"))
        # getting the config name from the previous job.
        config_name = hydra_config.hydra.job.config_name
        # concatenating the original overrides with the current overrides
        overrides = original_overrides + current_overrides
        # compose a new config from scratch
        cfg = compose(config_name, overrides=overrides)
    # train
    print("Running in ", os.getcwd())
    print(OmegaConf.to_yaml(cfg))
if __name__ == "__main__":
    my_app()
~/tmp$ python train.py
Running in /home/omry/tmp/outputs/2021-04-19/21-23-13
load_checkpoint: null
batch_size: 16
loss: l2
~/tmp$ python train.py load_checkpoint=/home/omry/tmp/outputs/2021-04-19/21-23-13
Running in /home/omry/tmp/outputs/2021-04-19/21-23-22
load_checkpoint: /home/omry/tmp/outputs/2021-04-19/21-23-13
batch_size: 16
loss: l2
~/tmp$ python train.py load_checkpoint=/home/omry/tmp/outputs/2021-04-19/21-23-13 batch_size=32
Running in /home/omry/tmp/outputs/2021-04-19/21-23-28
load_checkpoint: /home/omry/tmp/outputs/2021-04-19/21-23-13
batch_size: 32
loss: l2
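One thing to keep in mind with plain concatenation is that the same key can then appear twice in the combined list. Below is a minimal sketch of collapsing such duplicates before the list is handed to compose, so that the value from the current run replaces the stored one. merge_overrides is a hypothetical helper, not part of Hydra, and it assumes plain key=value overrides only (it does not handle the +key or ~key forms).

```python
def merge_overrides(original, current):
    """Combine two lists of 'key=value' strings; for a repeated key the last value wins.

    Hypothetical helper for illustration only, not a Hydra API.
    """
    merged = {}
    for item in list(original) + list(current):
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected 'key=value', got {item!r}")
        # Remove and re-insert so an overridden key moves to the end of the list.
        merged.pop(key, None)
        merged[key] = value
    return [f"{key}={value}" for key, value in merged.items()]


if __name__ == "__main__":
    previous_run = ["data=MNIST", "batch_size=64", "loss=l2"]
    this_run = ["batch_size=128"]
    print(merge_overrides(previous_run, this_run))
```

With the lists from the question's example commands, the batch_size=64 entry from the earlier run is dropped in favour of batch_size=128.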
I'm trying to compute some technical indicators using the factor definitions at this link: https://github.com/enigmampc/catalyst/blob/master/catalyst/pipeline/factors/equity/technical.py,
but in the Quantopian notebook "from numexpr import evaluate" fails, so evaluate is not defined.
How can I solve this?
from numexpr import evaluate
from numpy import nanmax, nanmin
class FastochasticOscillator(CustomFactor):
    inputs = (USEquityPricing.close, USEquityPricing.high, USEquityPricing.low)
    window_safe = True
    window_length = 14
    def compute(self, today, assets, out, closes, highs, lows):
        highest_high = nanmax(highs, axis=0)
        lowest_low = nanmin(lows, axis=0)
        latest_close = closes[-1]
        evaluate(
            '((tc - ll) / (hh - ll)) * 100',
            local_dict={
                'tc': latest_close,
                'll': lowest_low,
                'hh': highest_high,
            },
            global_dict={},
            out=out,
        )
# The lines below sit inside the function that builds the pipeline (its definition is not shown here).
K = FastochasticOscillator(window_length=14)
return Pipeline(columns={
    'K': K,
}, screen=base)
I'm working in the Quantopian notebook, and when I attempt the import it gives me this: InputRejected: Importing evaluate from numexpr raised an ImportError. Did you mean to import errstate from numpy?
I did not find a way to import numexpr on Quantopian, although in Jupyter it works without problems, so the problem is specific to the online IDE. I simply rewrote the FastOsc indicator without numexpr so that I can use it inside the pipeline in the Quantopian online IDE:
class Fast(CustomFactor):
    inputs = (USEquityPricing.close, USEquityPricing.high, USEquityPricing.low)
    window_length = 14
    def compute(self, today, assets, out, close, high, low):
        highest_high = nanmax(high, axis=0)
        lowest_low = nanmin(low, axis=0)
        latest_close = close[-1]
        out[:] = (latest_close - lowest_low) / (highest_high - lowest_low) * 100
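To sanity-check the formula outside Quantopian, here is a minimal pure-Python sketch of the same %K calculation for a single asset over one window. The function name fast_stochastic_k and the sample prices are made up for illustration, and unlike nanmax/nanmin this sketch does not skip NaN values. The zero-range guard is my own assumption, not something the factor above does.

```python
import math


def fast_stochastic_k(closes, highs, lows):
    """Fast stochastic %K for one asset over one window of prices.

    %K = (latest close - lowest low) / (highest high - lowest low) * 100
    """
    highest_high = max(highs)
    lowest_low = min(lows)
    price_range = highest_high - lowest_low
    if price_range == 0:
        # Flat window: the ratio is undefined, so return NaN instead of dividing by zero.
        return math.nan
    return (closes[-1] - lowest_low) / price_range * 100


if __name__ == "__main__":
    # Made-up three-day window: highest high 13, lowest low 9, latest close 12.
    print(fast_stochastic_k([10, 11, 12], [11, 12, 13], [9, 10, 11]))  # 75.0
```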
I want to take test case results from Robot Framework runs and import those results into other tools (ElasticSearch, ALM tools, etc).
Towards that end I would like to be able to generate a text file with one line per test. Here is an example pipe-delimited line:
testcase name | time run | duration | status
There are other fields I would add but those are the basic ones. Any help appreciated. I have been looking at robot.result http://robot-framework.readthedocs.io/en/3.0.2/autodoc/robot.result.html but haven't figured it out yet. If/when I do I will post an answer here.
Thanks,
The output.xml file is very easy to parse with normal XML parsing libraries.
Here's a quick example:
from __future__ import print_function
import xml.etree.ElementTree as ET
from datetime import datetime
def get_robot_results(filepath):
    results = []
    with open(filepath, "r") as f:
        xml = ET.parse(f)
        root = xml.getroot()
        if root.tag != "robot":
            raise Exception("expect root tag 'robot', got '%s'" % root.tag)
    for suite_node in root.findall("suite"):
        for test_node in suite_node.findall("test"):
            status_node = test_node.find("status")
            name = test_node.attrib["name"]
            status = status_node.attrib["status"]
            start = status_node.attrib["starttime"]
            end = status_node.attrib["endtime"]
            start_time = datetime.strptime(start, '%Y%m%d %H:%M:%S.%f')
            end_time = datetime.strptime(end, '%Y%m%d %H:%M:%S.%f')
            elapsed = str(end_time-start_time)
            results.append([name, start, elapsed, status])
    return results
if __name__ == "__main__":
    results = get_robot_results("output.xml")
    for row in results:
        print(" | ".join(row))
Bryan is right that it's easy to parse Robot's output.xml using standard XML parsing modules. Alternatively you can use Robot's own result parsing modules and the model you get from it:
from robot.api import ExecutionResult, SuiteVisitor
class PrintTestInfo(SuiteVisitor):
    def visit_test(self, test):
        print('{} | {} | {} | {}'.format(test.name, test.starttime,
                                         test.elapsedtime, test.status))
result = ExecutionResult('output.xml')
result.suite.visit(PrintTestInfo())
For more details about the APIs used above see http://robot-framework.readthedocs.io/.
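A note on the plain-XML approach: the two nested findall loops above only reach tests that sit directly inside a top-level suite. Below is a minimal standard-library sketch that walks every test element at any nesting depth and builds the pipe-delimited lines. The function name robot_result_lines and the sample XML are made up for illustration; the sample only mimics the attribute names used in the XML answer above (name, status, starttime, endtime) and is not real Robot Framework output.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

TIME_FORMAT = "%Y%m%d %H:%M:%S.%f"


def robot_result_lines(xml_text):
    """Return one 'name | start | duration | status' line per test element."""
    root = ET.fromstring(xml_text)
    if root.tag != "robot":
        raise ValueError("expected root tag 'robot', got %r" % root.tag)
    lines = []
    # iter() visits matching elements at any depth, so nested suites are covered.
    for test_node in root.iter("test"):
        # find() looks only at direct children, so this is the test's own status,
        # not the status of one of its keywords.
        status_node = test_node.find("status")
        start = status_node.attrib["starttime"]
        end = status_node.attrib["endtime"]
        elapsed = datetime.strptime(end, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)
        lines.append(" | ".join(
            [test_node.attrib["name"], start, str(elapsed), status_node.attrib["status"]]
        ))
    return lines


# Made-up sample in the shape assumed above.
SAMPLE_XML = """<robot>
  <suite name="Top">
    <suite name="Nested">
      <test name="Login works">
        <kw name="Open Browser">
          <status status="PASS" starttime="20170101 10:00:00.000" endtime="20170101 10:00:00.500"/>
        </kw>
        <status status="PASS" starttime="20170101 10:00:00.000" endtime="20170101 10:00:01.500"/>
      </test>
    </suite>
    <test name="Logout works">
      <status status="FAIL" starttime="20170101 10:00:02.000" endtime="20170101 10:00:02.250"/>
    </test>
  </suite>
</robot>"""

if __name__ == "__main__":
    for line in robot_result_lines(SAMPLE_XML):
        print(line)
```

To get the text file asked for in the question, the returned lines can be joined with newlines and written out with an ordinary open(..., "w") call.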
My app interfaces with the IPython Qt shell with code something like this:
from IPython.core.interactiveshell import ExecutionResult
shell = self.kernelApp.shell # ZMQInteractiveShell
code = compile(script, file_name, 'exec')
result = ExecutionResult()
shell.run_code(code, result=result)
if result:
    self.show_result(result)
The problem is: how can show_result show the traceback resulting from exceptions in code?
Neither the error_before_exec nor the error_in_exec ivars of ExecutionResult seem to give references to the traceback. Similarly, neither sys nor shell.user_ns.namespace.get('sys') have sys.exc_traceback attributes.
Any ideas? Thanks!
Edward
IPython/core/interactiveshell.py contains InteractiveShell._showtraceback:
def _showtraceback(self, etype, evalue, stb):
    """Actually show a traceback. Subclasses may override..."""
    print(self.InteractiveTB.stb2text(stb), file=io.stdout)
The solution is to monkey-patch InteractiveShell._showtraceback so that the traceback is written to the Qt console (via sys.stderr in the code below):
from __future__ import print_function
import sys
...
shell = self.kernelApp.shell  # ZMQInteractiveShell
code = compile(script, file_name, 'exec')
def show_traceback(etype, evalue, stb, shell=shell):
    print(shell.InteractiveTB.stb2text(stb), file=sys.stderr)
    sys.stderr.flush()  # <==== Oh, so important
old_show = getattr(shell, '_showtraceback', None)
shell._showtraceback = show_traceback
shell.run_code(code)
if old_show:
    shell._showtraceback = old_show
Note: there is no need to pass an ExecutionResult object to shell.run_code().
EKR
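A possible refinement of the patch-and-restore pattern above: wrapping it in a context manager guarantees the original method is put back even if the code in between raises. This is a generic sketch, not IPython-specific. patched_attr is a hypothetical helper name, and FakeShell is a stand-in used only so that the example runs without IPython.

```python
import contextlib


@contextlib.contextmanager
def patched_attr(obj, name, replacement):
    """Temporarily set obj.<name> to replacement, restoring the old value on exit."""
    missing = object()  # sentinel: distinguishes "no attribute" from "attribute is None"
    original = getattr(obj, name, missing)
    setattr(obj, name, replacement)
    try:
        yield
    finally:
        if original is missing:
            delattr(obj, name)
        else:
            setattr(obj, name, original)


class FakeShell:
    """Stand-in for the real shell object, for demonstration only."""

    def _showtraceback(self, etype, evalue, stb):
        return "original handler"


if __name__ == "__main__":
    shell = FakeShell()
    with patched_attr(shell, "_showtraceback", lambda etype, evalue, stb: "patched handler"):
        print(shell._showtraceback(None, None, []))  # patched handler
    print(shell._showtraceback(None, None, []))  # original handler
```

In the answer's setting the body of the with block would be the shell.run_code(code) call.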
I'm trying to define parsers in a build.sbt file.
I'm using this plugin, added with this line in plugins.sbt:
addSbtPlugin("com.gilt" % "sbt-dependency-graph-sugar" % "0.7.4")
When I specify this in build.sbt:
import sbt.complete.DefaultParsers._
val servers = token(
  literal("desarrollo") |
  literal("parametrizacion")
)
SBT complains with this error message:
reference to literal is ambiguous;
reference to token is ambiguous;
it is imported twice in the same scope by
import sbt.complete.DefaultParsers._
and import _root_.gilt.DependencyGraph._
How can I avoid this namespace clash between basic SBT classes?
One solution is this:
import sbt.complete.{DefaultParsers ⇒ DP}
import sbt.complete.DefaultParsers._
val servers = DP.token(
  DP.literal("desarrollo") |
  DP.literal("parametrizacion")
)
I don't fully like it, because it adds clutter.
The ideal solution is to hide unwanted imports.
This solution creates a new scope, so there's no interference with imports:
name := "MyProject"
{
  import sbt.complete.DefaultParsers._
  val servers = token(
    "desarrollo" | "parametrizacion"
  )
}