How does Hydra manage the timezone when it creates output directories? - python-3.6

I have a .yaml configuration file which looks like:
key: value
hydra:
  run:
    dir: ./data_fetcher/hydra_outputs/${now:%Y-%m-%d}/${now:%H-%M-%S}
and a Python file main.py:

import hydra
from omegaconf import DictConfig

@hydra.main(config_path="data_fetcher/config", config_name="config")
def main(cfg: DictConfig):
    pass

if __name__ == "__main__":
    main()
When running main.py, an output directory is created according to the current date and time.
How does Hydra get the current time? Is it possible to change the timezone?

Hydra registers a simple OmegaConf custom resolver for now with a line like:

register("now", lambda pattern: strftime(pattern, localtime()))

where strftime and localtime come from Python's time module, so the timestamp uses the machine's local time.
You can register your own custom resolver with a different name before @hydra.main() runs that does what you want.
Read about custom resolvers in the OmegaConf documentation.
Read about how to get the time in a different time zone here.
You can also file a feature request to add time zone support to the ${now:...} custom resolver in Hydra; a PR would be appreciated.
For example, ${now:...} could support an optional second parameter for the time zone.
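For instance, here is a minimal sketch of a UTC-based resolver (the resolver name utcnow is illustrative, and the registration call assumes OmegaConf >= 2.1; older versions spell it OmegaConf.register_resolver):

from datetime import datetime, timezone

from omegaconf import OmegaConf

# Hypothetical "utcnow" resolver that formats the current UTC time.
# It must be registered before @hydra.main() runs, e.g. at import time.
OmegaConf.register_new_resolver(
    "utcnow", lambda pattern: datetime.now(timezone.utc).strftime(pattern)
)

The config can then use it the same way as ${now:...}, e.g. dir: ./data_fetcher/hydra_outputs/${utcnow:%Y-%m-%d}/${utcnow:%H-%M-%S}.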

Related

db.create_all() doesn't create a database in a desired directory

I am trying to create a database for my Flask application in the main directory of my project. This is my code for initializing a database:
app.config["SQLALCHEMY_DATABASE_URI"] = 'sqlite:///users.db'
app.config["SQLALCHEMY_TRACK_MODIFICATIONS"] = False
db = SQLAlchemy(app)
Flask requires an application context, so this is how I create the database:
$ flask shell
>>> db.create_all()
I also tried doing it with:
$ python
>>> from app import app, db
>>> app.app_context().push()
>>> db.create_all()
Both of these options create the database in the /instance directory. Is there any way to get around this and create it in the main directory of the project?
The instance path is the preferred and default location for the database, and I recommend using it for security reasons. However, you can also specify the full path in the configuration.
The following configuration corresponds to an outdated variant in which the database is created in the current working directory. Please don't use this anymore:

app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///' + os.path.join(os.getcwd(), 'users.db')

This corresponds to the current solution:

app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///' + os.path.join(app.instance_path, 'users.db')
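Put together, a minimal sketch might look like this (assuming Flask and Flask-SQLAlchemy are installed; the makedirs call is an extra guard in case the instance directory does not exist yet):

import os

from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)

# Make sure the instance directory exists before SQLite creates the file.
os.makedirs(app.instance_path, exist_ok=True)

app.config["SQLALCHEMY_DATABASE_URI"] = (
    "sqlite:///" + os.path.join(app.instance_path, "users.db")
)
app.config["SQLALCHEMY_TRACK_MODIFICATIONS"] = False

db = SQLAlchemy(app)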

Airflow - Custom XCom backend on Ubuntu

I'm trying to implement a custom XCom backend.
These are the steps I took:
Created an "include" directory at the main Airflow dir (AIRFLOW_HOME).
Created this "custom_xcom_backend.py" file inside it:
from typing import Any

from airflow.models.xcom import BaseXCom
import pandas as pd

class CustomXComBackend(BaseXCom):
    @staticmethod
    def serialize_value(value: Any):
        # Serialize DataFrames as JSON before handing off to the base class.
        if isinstance(value, pd.DataFrame):
            value = value.to_json(orient='records')
        return BaseXCom.serialize_value(value)

    @staticmethod
    def deserialize_value(result) -> Any:
        result = BaseXCom.deserialize_value(result)
        result = pd.read_json(result)
        return result
Set in the config file:
xcom_backend = include.custom_xcom_backend.CustomXComBackend
When I restarted the webserver, I got:
airflow.exceptions.AirflowConfigException: The object could not be loaded. Please check "xcom_backend" key in "core" section. Current value: "include.cust...
My guess is that it's not recognizing the "include" folder, but how can I fix it?
*Note: There is no Docker; Airflow is installed directly on an Ubuntu machine.
Thanks!
So I solved it:
Put custom_xcom_backend.py into the plugins directory.
Set in the config file (see the sketch below):
xcom_backend = custom_xcom_backend.CustomXComBackend
Restart all Airflow-related services.
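For reference, a sketch of the corresponding airflow.cfg entry; the error message points at the "core" section, so the key lives there:

[core]
xcom_backend = custom_xcom_backend.CustomXComBackend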
*Note: Do not store DataFrames that way (bad practice).
Sources I used:
https://www.youtube.com/watch?v=iI0ymwOij88

How do I tell Dagit (the Dagster GUI) to run on an existing Dask cluster?

I'm using dagster 0.11.3 (the latest as of this writing)
I've created a Dagster pipeline (saved as pipeline.py) that looks like this:
from dagster import ModeDefinition, pipeline, solid
from dagster_dask import dask_executor

@solid
def return_a(context):
    return 12.34

@pipeline(
    mode_defs=[
        ModeDefinition(
            executor_defs=[dask_executor]  # Note: dask only!
        )
    ]
)
def the_pipeline():
    return_a()
I have the DAGSTER_HOME environment variable set to a directory that contains a file named dagster.yaml, which is an empty file. This should be ok because the defaults are reasonable based on these docs: https://docs.dagster.io/deployment/dagster-instance.
I have an existing Dask cluster running at "scheduler:8786". Based on these docs: https://docs.dagster.io/deployment/custom-infra/dask, I created a run config named config.yaml that looks like this:
execution:
  dask:
    config:
      cluster:
        existing:
          address: "scheduler:8786"
I have SUCCESSFULLY used this run config with Dagster like so:
$ dagster pipeline execute -f pipeline.py -c config.yaml
(I checked the Dask logs and made sure that it did indeed run on my Dask cluster)
My question is: How can I get Dagit to use this Dask cluster?
The only thing I have found that seems related is this:
https://docs.dagster.io/_apidocs/execution#executors
...but it doesn't even mention Dask as an option (it has dagster.in_process_executor and dagster.multiprocess_executor, which don't seem at all related to dask).
Probably I need to configure dagster-dask, which is documented here: https://docs.dagster.io/_apidocs/libraries/dagster-dask#dask-dagster-dask
...but where do I put that run config when using Dagit? There's no way to feed config.yaml to Dagit, for example.
Some options:
You can manually plug the values from config.yaml into the Dagit playground.
You can bind the config directly to the executor if you never need to change it: https://docs.dagster.io/concepts/configuration/configured#configured-api
You can create a preset from that config YAML: https://docs.dagster.io/tutorial/advanced-tutorial/pipelines#pipeline-config-presets
Given the context, I would recommend the configured API.
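A rough sketch of the configured route (the name dask_on_cluster is illustrative, and the exact signature for naming the configured executor may vary slightly between dagster versions):

from dagster import ModeDefinition, configured, pipeline, solid
from dagster_dask import dask_executor

# Bake the existing cluster's address into the executor definition so
# Dagit no longer needs any run config to target the Dask cluster.
dask_on_cluster = configured(dask_executor, name="dask_on_cluster")(
    {"cluster": {"existing": {"address": "scheduler:8786"}}}
)

@solid
def return_a(context):
    return 12.34

@pipeline(mode_defs=[ModeDefinition(executor_defs=[dask_on_cluster])])
def the_pipeline():
    return_a()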

Disable file output of hydra

I'm using hydra to log hyperparameters of experiments.
import hydra
from omegaconf import DictConfig, OmegaConf

@hydra.main(config_name="config", config_path="../conf")
def evaluate_experiment(cfg: DictConfig) -> None:
    print(OmegaConf.to_yaml(cfg))
    ...
Sometimes I want to do a dry run to check something. For this I don't need any saved parameters, so I'm wondering how I can disable writing to the filesystem completely in this case?
The answer from Omry Yadan works well if you want to solve this using the CLI. However, you can also add these flags to your config file so that you don't have to type them every time you run your script. If you want to go this route, make sure you add the following items to your root config file:

defaults:
  - _self_
  - override hydra/hydra_logging: disabled
  - override hydra/job_logging: disabled

hydra:
  output_subdir: null
  run:
    dir: .
There is an enhancement request aimed at Hydra 1.1 to support disabling working directory management.
Working directory management does several things:
Creating a working directory for the run
Changing the working directory to the created dir
There are other related features:
Saving log files
Saving files like config.yaml and hydra.yaml into .hydra in the working directory
Different features have different ways to disable them:
To prevent the creation of a working directory, override hydra.run.dir to . (the current directory).
To prevent saving the files into .hydra, override hydra.output_subdir to null.
To prevent the creation of log files, disable the logging output of hydra/hydra_logging and hydra/job_logging (see the example below).
A complete example might look like:
$ python foo.py hydra.run.dir=. hydra.output_subdir=null hydra/job_logging=disabled hydra/hydra_logging=disabled
Note that as always you can also override those config values through your config file.

Plone Dexterity Behaviors referenceablebehavior not referenceable?

I am following the tests here: https://github.com/plone/plone.app.referenceablebehavior/blob/master/plone/app/referenceablebehavior/referenceable.txt
I added plone.app.referenceablebehavior to Plone 4.3, created a type TTW (through the web), and made it referenceable.
Then I created an instance of the type in the site root called "My Referenceable Type Instance" and tried the following in debug mode:
>>> from plone.app.referenceablebehavior.referenceable import IReferenceable
>>> IReferenceable.providedBy(app.Plone['my-referenceable-type-instance'])
False
I would expect the result to be True. Is this a bug, or am I missing something?
[0] My buildout:
[buildout]
extends = https://raw.github.com/pythonpackages/buildout-plone/master/4.3.x-dev

[plone]
eggs +=
    plone.app.referenceablebehavior
In a debug session, you need to set the local site manager before attempting this. Try:
>>> from zope.component.hooks import setSite
>>> setSite(app.Plone)
...prior to checking whether IReferenceable is provided by the object. This is necessary because Dexterity uses something called an Object Specification Descriptor that looks up interfaces dynamically from the Factory Type Information of the type, which is site-specific: you cannot retrieve site-specific configuration without first configuring the local site for lookups.
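Putting it together, the debug session would look like this (assuming the behavior is correctly enabled on the type, the check should now return True):

>>> from zope.component.hooks import setSite
>>> setSite(app.Plone)
>>> from plone.app.referenceablebehavior.referenceable import IReferenceable
>>> IReferenceable.providedBy(app.Plone['my-referenceable-type-instance'])
True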
