I have a GCP Cloud Composer environment running composer-1.10.0-airflow-1.10.6 with Python 3 (3.6, to be precise). I am calling an Apache Beam pipeline on Dataflow using a python_operator.PythonOperator. Here is the code snippet.
Calling the pipeline function:
test_greeting = python_operator.PythonOperator(
    task_id='python_pipeline',
    python_callable=run_pipeline
)
The pipeline function is as follows
def run_pipeline():
    print("Test Pipeline")
    pipeline_args = [
        "--runner", "DataflowRunner",
        "--project", "*****",
        "--temp_location", "gs://******/temp",
        "--region", "us-east1",
        "--job_name", "job1199",
        "--zone", "us-east1-b"
    ]
    pipeline_options = PipelineOptions(pipeline_args)
    pipe = beam.Pipeline(options=pipeline_options)
    small_sum = (
        pipe
        | beam.Create([18, 5, 7, 7, 9, 23, 13, 5])
        | "Combine Globally" >> beam.CombineGlobally(AverageFn())
        | 'Write results' >> beam.io.WriteToText('gs://******/ouptut_from_pipline/combine')
    )
    run_result = pipe.run()
    run_result.wait_until_finish()
    return "True"
When I run this, the pipeline starts on Dataflow but fails with the following error:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/dataflow_worker/batchworker.py", line 648, in do_work
work_executor.execute()
File "/usr/local/lib/python3.6/site-packages/dataflow_worker/executor.py", line 150, in execute
test_shuffle_sink=self._test_shuffle_sink)
File "/usr/local/lib/python3.6/site-packages/dataflow_worker/executor.py", line 116, in create_operation
is_streaming=False)
File "apache_beam/runners/worker/operations.py", line 1032, in apache_beam.runners.worker.operations.create_operation
File "apache_beam/runners/worker/operations.py", line 845, in apache_beam.runners.worker.operations.create_pgbk_op
File "apache_beam/runners/worker/operations.py", line 903, in apache_beam.runners.worker.operations.PGBKCVOperation.__init__
File "/usr/local/lib/python3.6/site-packages/apache_beam/internal/pickler.py", line 290, in loads
return dill.loads(s)
File "/usr/local/lib/python3.6/site-packages/dill/_dill.py", line 275, in loads
return load(file, ignore, **kwds)
File "/usr/local/lib/python3.6/site-packages/dill/_dill.py", line 270, in load
return Unpickler(file, ignore=ignore, **kwds).load()
File "/usr/local/lib/python3.6/site-packages/dill/_dill.py", line 472, in load
obj = StockUnpickler.load(self)
File "/usr/local/lib/python3.6/site-packages/dill/_dill.py", line 462, in find_class
return StockUnpickler.find_class(self, module, name)
ModuleNotFoundError: No module named 'unusual_prefix_162ac8b7030d5bd1ff5f128a26483932d3968a4d_python_bash'
The Beam version is Apache Beam Python 3.6 SDK 2.19.0.
I suspect Python 3.6 may be the issue, because launching the same pipeline on Dataflow directly from my local system works fine, and my local system runs Python 3.7.
I can't find a way to test this theory, though.
Any tips on how to resolve this issue would be helpful.
I am working on the code below and get an SSHException('No existing session') error:
from ncclient import manager
with manager.connect(host="192.168.1.40", port=830, username="cisco", password="cisco", hostkey_verify=False, device_params={'name':'iosxr'}, ssh_config=True, allow_agent=False,look_for_keys=False, timeout=10) as nc_conn:
    nc_config = nc_conn.get_config(source='running').data_xml
    print(nc_config)
yapi#MacBook-Pro-3 iTel % python test01.py
Traceback (most recent call last):
File "/Users/yapi/Documents/Scripts/Non-GIT/Python/2023/iTel/test01.py", line 4, in <module>
with manager.connect(host="192.168.1.40", port=830, username="abdul", password="cisco", hostkey_verify=False, device_params={'name':'iosxr'}, ssh_config=True, allow_agent=False,look_for_keys=False, timeout=10) as nc_conn:
File "/usr/local/lib/python3.9/site-packages/ncclient/manager.py", line 176, in connect
return connect_ssh(*args, **kwds)
File "/usr/local/lib/python3.9/site-packages/ncclient/manager.py", line 143, in connect_ssh
session.connect(*args, **kwds)
File "/usr/local/lib/python3.9/site-packages/ncclient/transport/ssh.py", line 364, in connect
self._auth(username, password, key_filenames, allow_agent, look_for_keys)
File "/usr/local/lib/python3.9/site-packages/ncclient/transport/ssh.py", line 480, in _auth
raise AuthenticationError(repr(saved_exception))
ncclient.transport.errors.AuthenticationError: SSHException('No existing session')
yapi#MacBook-Pro-3 iTel %
I have tried different flags in the connect function, but with no luck.
I used dbt init to create a profiles.yml in my .dbt folder. It looks like this:
spring_project:
  outputs:
    dev:
      account: xxx.snowflakecomputing.com
      database: PROD_DWH
      password: password
      role: SYSADMIN
      schema: STG
      threads: 1
      type: snowflake
      user: MYUSERNAME
      warehouse: DEV_XS_WH
  target: dev
Next, I created a new folder on my desktop that contains only a dbt_project.yml file with this content:
profile: 'spring_project'
When I run this from my project folder:
dbt debug --config-dir
I get this:
21:48:59 Running with dbt=1.2.1
21:48:59 To view your profiles.yml file, run:
open /Users/myusername/.dbt
However, when I run dbt like this:
dbt run --profiles-dir /Users/myusername/.dbt
I get this:
21:43:39 Encountered an error while reading the project:
21:43:39 ERROR: Runtime Error
Invalid config version: 1, expected 2
Error encountered in /Users/myusername/Desktop/spring_project/dbt_project.yml
21:43:39 Encountered an error:
Runtime Error
Could not run dbt
21:43:39 Traceback (most recent call last):
File "/opt/homebrew/lib/python3.10/site-packages/dbt/task/base.py", line 108, in from_args
config = cls.ConfigType.from_args(args)
File "/opt/homebrew/lib/python3.10/site-packages/dbt/config/runtime.py", line 226, in from_args
project, profile = cls.collect_parts(args)
File "/opt/homebrew/lib/python3.10/site-packages/dbt/config/runtime.py", line 194, in collect_parts
partial = Project.partial_load(project_root, verify_version=version_check)
File "/opt/homebrew/lib/python3.10/site-packages/dbt/config/project.py", line 639, in partial_load
return PartialProject.from_project_root(
File "/opt/homebrew/lib/python3.10/site-packages/dbt/config/project.py", line 485, in from_project_root
raise DbtProjectError(
dbt.exceptions.DbtProjectError: Runtime Error
Invalid config version: 1, expected 2
Error encountered in /Users/myusername/Desktop/spring_project/dbt_project.yml
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/homebrew/lib/python3.10/site-packages/dbt/main.py", line 129, in main
results, succeeded = handle_and_check(args)
File "/opt/homebrew/lib/python3.10/site-packages/dbt/main.py", line 191, in handle_and_check
task, res = run_from_args(parsed)
File "/opt/homebrew/lib/python3.10/site-packages/dbt/main.py", line 218, in run_from_args
task = parsed.cls.from_args(args=parsed)
File "/opt/homebrew/lib/python3.10/site-packages/dbt/task/base.py", line 185, in from_args
return super().from_args(args)
File "/opt/homebrew/lib/python3.10/site-packages/dbt/task/base.py", line 114, in from_args
raise dbt.exceptions.RuntimeException("Could not run dbt") from exc
dbt.exceptions.RuntimeException: Runtime Error
Could not run dbt
What am I doing wrong?
Most likely the cause is a missing config-version key in dbt_project.yml, as the error message indicates:
dbt.exceptions.DbtProjectError: Runtime Error
Invalid config version: 1, expected 2
From the dbt documentation for config-version:
config-version: 2
Specify your dbt_project.yml as using the v2 structure.
Default:
Without this configuration, dbt will assume your dbt_project.yml uses the version 1 syntax, which was deprecated in dbt v0.19.0.
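As a rough illustration of the point above, the sketch below checks whether the text of a dbt_project.yml contains a top-level `config-version: 2` line. The helper name `has_config_version_2` is hypothetical and not part of dbt, and the check is a plain text match rather than a real YAML parse. The "fixed" text only adds the missing key; a real project file may need further keys that the question does not show.

```python
import re


def has_config_version_2(project_yaml_text):
    """Return True if the text has a top-level `config-version: 2` line.

    Hypothetical helper for illustration only; it does not parse YAML.
    """
    pattern = re.compile(r"^config-version:[ \t]*2[ \t]*(#.*)?$", re.MULTILINE)
    return bool(pattern.search(project_yaml_text))


# The file content from the question: only a profile key.
original = "profile: 'spring_project'\n"

# The same content with the missing key added at the top level.
fixed = "config-version: 2\n" + original

print(has_config_version_2(original))
print(has_config_version_2(fixed))
```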
I have a couple of DAGs running that execute Python scripts which basically copy data from A to B. One of the DAGs throws an error, but the data still gets copied, so the error apparently does not affect the execution of the Python script itself.
The only thing that is "special" about this DAG is that the Python script opens a connection to a Postgres database using psycopg2~=2.8.5, but I am not sure whether this is the root cause.
I also checked the permissions for the user, which seem to be fine, at least in the dags folder.
Is there a specific timeout value I have to adjust in the config?
[2021-05-19 12:53:42,036] {taskinstance.py:1145} ERROR - Bash command failed 255
Traceback (most recent call last):
File "/hereisthepath/venv/lib64/python3.6/site-packages/airflow/models/taskinstance.py", line 983, in _run_raw_task
result = task_copy.execute(context=context)
File "/hereisthepath/venv/lib64/python3.6/site-packages/airflow/operators/bash_operator.py", line 134, in execute
raise AirflowException("Bash command failed")
airflow.exceptions.AirflowException: Bash command failed
Update: This is the part of the operator that fails. I copied the entire function; the error is raised at line 134 (raise AirflowException("Bash command failed")).
def execute(self, context):
    """
    Execute the bash command in a temporary directory
    which will be cleaned afterwards
    """
    self.log.info("Tmp dir root location: \n %s", gettempdir())
    # Prepare env for child process.
    env = self.env
    if env is None:
        env = os.environ.copy()
    airflow_context_vars = context_to_airflow_vars(context, in_env_var_format=True)
    self.log.debug('Exporting the following env vars:\n%s',
                   '\n'.join(["{}={}".format(k, v)
                              for k, v in airflow_context_vars.items()]))
    env.update(airflow_context_vars)
    self.lineage_data = self.bash_command
    with TemporaryDirectory(prefix='airflowtmp') as tmp_dir:
        with NamedTemporaryFile(dir=tmp_dir, prefix=self.task_id) as f:
            f.write(bytes(self.bash_command, 'utf_8'))
            f.flush()
            fname = f.name
            script_location = os.path.abspath(fname)
            self.log.info(
                "Temporary script location: %s",
                script_location
            )
            def pre_exec():
                # Restore default signal disposition and invoke setsid
                for sig in ('SIGPIPE', 'SIGXFZ', 'SIGXFSZ'):
                    if hasattr(signal, sig):
                        signal.signal(getattr(signal, sig), signal.SIG_DFL)
                os.setsid()
            self.log.info("Running command: %s", self.bash_command)
            self.sub_process = Popen(
                ['bash', fname],
                stdout=PIPE, stderr=STDOUT,
                cwd=tmp_dir, env=env,
                preexec_fn=pre_exec)
            self.log.info("Output:")
            line = ''
            for line in iter(self.sub_process.stdout.readline, b''):
                line = line.decode(self.output_encoding).rstrip()
                self.log.info(line)
            self.sub_process.wait()
            self.log.info(
                "Command exited with return code %s",
                self.sub_process.returncode
            )
            if self.sub_process.returncode:
                raise AirflowException("Bash command failed")
        if self.xcom_push_flag:
            return line
Update 2: This behavior really does seem to be related to psycopg2. I have now tested all other possible error sources, and the error only occurs when I test with the Postgres data source using the psycopg2 package. In the meantime I also upgraded to the most recent version of psycopg2 (2.8.6), but without success.
Maybe this helps with further investigation.
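The quoted execute() raises AirflowException whenever the child's return code is non-zero, no matter what the script did before it exited. The sketch below reproduces that pattern outside Airflow with a hypothetical child script that completes its "copy" and then exits with status 255. What actually produces the 255 in the real script is not known from the log; the helper name `run_script` and the script text are assumptions for illustration.

```python
import subprocess
import sys


def run_script(source):
    """Run a Python snippet in a child process; return (output, return code)."""
    proc = subprocess.run(
        [sys.executable, "-c", source],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.stdout, proc.returncode


# Hypothetical script: the work finishes first, then the process exits non-zero.
script = "print('rows copied'); raise SystemExit(255)"

output, returncode = run_script(script)

# Same test the operator applies: any non-zero return code counts as failure.
task_failed = bool(returncode)

print(output.strip())
print(returncode, task_failed)
```

Logging the exit status at the end of the real script (or running it by hand and checking `echo $?` afterwards) would show whether the copy step itself or something during shutdown sets the non-zero status.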
I have trained and saved some NER models using
torch.save(model)
I need to load these model files (extension .pt) for evaluation using
torch.load('PATH_TO_MODEL.pt')
When I do, I get the following error: 'BertConfig' object has no attribute 'return_dict'
Because of this, I updated my transformers package to the latest version, but the error persists.
This is the stack trace:
Traceback (most recent call last):
File "/home/systematicReviews/train_mtl_3.py", line 523, in <module>
test_loss, test_cr, test_cr_fine = evaluate_i(test_model, optimizer, scheduler, validation_dataloader, args, device)
File "/home/systematicReviews/train_mtl_3.py", line 180, in evaluate_i
e_loss_coarse, e_output, e_labels, e_loss_fine, e_f_output, e_f_labels, mask, e_cumulative_loss = defModel(args, e_input_ids, attention_mask=e_input_mask, P_labels=e_labels, P_f_labels=e_f_labels)
File "/home/anaconda3/envs/systreviewclassifi/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/home/anaconda3/envs/systreviewclassifi/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/anaconda3/envs/systreviewclassifi/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/home/systematicReviews/models/mtl/model.py", line 122, in forward
attention_mask = attention_mask
File "/home/anaconda3/envs/systreviewclassifi/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/home/anaconda3/envs/systreviewclassifi/lib/python3.6/site-packages/transformers/modeling_bert.py", line 784, in forward
return_dict = return_dict if return_dict is not None else self.config.use_return_dict
File "/home/anaconda3/envs/systreviewclassifi/lib/python3.6/site-packages/transformers/configuration_utils.py", line 219, in use_return_dict
return self.return_dict and not self.torchscript
AttributeError: 'BertConfig' object has no attribute 'return_dict'
Here is some more information about my system:
- `transformers` version: 3.1.0
- Platform: Linux-4.4.0-186-generic-x86_64-with-debian-stretch-sid
- Python version: 3.6.9
- PyTorch version (GPU?): 1.3.1 (True)
- Tensorflow version (GPU?): not installed (NA)
- Using GPU in script?: Yes
- Using distributed or parallel set-up in script?: No
It worked fine until now, but this error suddenly appeared. Any help or hint is appreciated.
Try saving your model with model.save_pretrained(output_dir). You can then load it with model = *.from_pretrained(output_dir), where * is the model class (e.g. BertForTokenClassification).
Saving only the model's state dictionary rather than the entire model works slightly differently. Instead of passing the model object to torch.save, use torch.save(model.state_dict(), 'path_to_the_model/model.pth'), then construct the model and load the weights with model.load_state_dict(torch.load('path_to_the_model/model.pth')).
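A likely reason the whole-model file breaks after an upgrade is that pickling stores each object's attribute dictionary as it was at save time, and unpickling restores it without calling `__init__`. The sketch below imitates this with the standard library only: `fake_lib` and both `Config` classes are hypothetical stand-ins, not the real transformers code, and the example assumes that the saved BertConfig predates the `return_dict` attribute.

```python
import pickle
import sys
import types

# Hypothetical stand-in for an installed library (not a real package).
fake_lib = types.ModuleType("fake_lib")
sys.modules["fake_lib"] = fake_lib


def make_old_config():
    """Config class as in the 'old' release: no return_dict attribute."""
    class Config:
        __module__ = "fake_lib"
        __qualname__ = "Config"

        def __init__(self):
            self.hidden_size = 768
    return Config


def make_new_config():
    """Config class as in the 'new' release: __init__ sets return_dict."""
    class Config:
        __module__ = "fake_lib"
        __qualname__ = "Config"

        def __init__(self):
            self.hidden_size = 768
            self.return_dict = False

        @property
        def use_return_dict(self):
            return self.return_dict
    return Config


# Save an object while the "old" release is installed.
fake_lib.Config = make_old_config()
blob = pickle.dumps(fake_lib.Config())

# "Upgrade" the library, then load the old file.
fake_lib.Config = make_new_config()
restored = pickle.loads(blob)  # restores the saved attributes, skips __init__

try:
    restored.use_return_dict
    error = None
except AttributeError as exc:
    error = str(exc)

print(restored.hidden_size)
print(error)
```

Both suggestions in the answer above avoid this: the config and weights are stored as data, and the objects are rebuilt by the currently installed class, so newly added attributes get their defaults.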
I just started the PyTorch tutorial "Deep Learning with PyTorch: A 60 Minute Blitz", and I should add that I haven't programmed in Python before (only in other languages such as Java).
Right now, my code looks like this:
import torch
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np
print("\n-------------------Backpropagation-------------------\n")
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
trainset = torchvision.datasets.CIFAR10(root='./data', train=True,download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4, shuffle=True, num_workers=2)
testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4, shuffle=False, num_workers=2)
classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
dataiter = iter(trainloader)
images, labels = dataiter.next()
def imshow(img):
    img = img / 2 + 0.5
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
imshow(torchvision.utils.make_grid(images))
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))
which should be consistent with the tutorial.
When I run this, I get the following error:
"C:\Program Files\Anaconda3\python.exe" C:/MA/pytorch/deepLearningWithPytorchTutorial/trainingClassifier.py
-------------------Backpropagation-------------------
Files already downloaded and verified
Files already downloaded and verified
-------------------Backpropagation-------------------
Files already downloaded and verified
Files already downloaded and verified
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Program Files\Anaconda3\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\Program Files\Anaconda3\lib\multiprocessing\spawn.py", line 114, in _main
prepare(preparation_data)
File "C:\Program Files\Anaconda3\lib\multiprocessing\spawn.py", line 225, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "C:\Program Files\Anaconda3\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
run_name="__mp_main__")
File "C:\Program Files\Anaconda3\lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "C:\Program Files\Anaconda3\lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "C:\Program Files\Anaconda3\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\MA\pytorch\deepLearningWithPytorchTutorial\trainingClassifier.py", line 23, in <module>
dataiter = iter(trainloader)
File "C:\Program Files\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 451, in __iter__
return _DataLoaderIter(self)
File "C:\Program Files\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 239, in __init__
w.start()
File "C:\Program Files\Anaconda3\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Program Files\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Program Files\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Program Files\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 33, in __init__
prep_data = spawn.get_preparation_data(process_obj._name)
File "C:\Program Files\Anaconda3\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
_check_not_importing_main()
File "C:\Program Files\Anaconda3\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
is not going to be frozen to produce an executable.''')
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
Traceback (most recent call last):
File "C:/MA/pytorch/deepLearningWithPytorchTutorial/trainingClassifier.py", line 23, in <module>
dataiter = iter(trainloader)
File "C:\Program Files\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 451, in __iter__
return _DataLoaderIter(self)
File "C:\Program Files\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 239, in __init__
w.start()
File "C:\Program Files\Anaconda3\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Program Files\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Program Files\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Program Files\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
reduction.dump(process_obj, to_child)
File "C:\Program Files\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe
Process finished with exit code 1
I already downloaded both the *.py and the *.ipynb file.
Running the *.ipynb with Jupyter works fine (but I don't want to program in the Jupyter web interface, I prefer PyCharm), while running the *.py in the console (Anaconda Prompt and cmd) fails with the same error.
Does anyone know how to fix this?
(I'm using Python 3.6.5 (from Anaconda) and PyCharm, OS: Win10 64-bit)
Thanks!
Bene
Update:
In case it is relevant: if I change num_workers=2 to num_workers=0 (in both DataLoader calls), it works.
Check the multiprocessing documentation, in particular the programming guidelines for Windows. You should wrap all operations in functions and then call them inside an if __name__ == '__main__' clause:
# required imports
def load_datasets(...):
    # Code to load the datasets with multiple workers

def train(...):
    # Code to train the model

if __name__ == '__main__':
    load_datasets()
    train()
In short, the idea here is to wrap the example code inside an if __name__ == '__main__' statement.
Because multiprocessing is implemented differently on Windows, you need to wrap your main code in this block:
if __name__ == '__main__':
For more info, you can check the official PyTorch Windows notes.
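A minimal runnable sketch of the guard described in the answers above, using only the standard library instead of a DataLoader. The names `square` and `main` are hypothetical. On Windows, worker processes are started by spawning a fresh interpreter that re-imports the main file, so anything that starts workers has to sit behind the guard rather than at module level.

```python
import multiprocessing as mp


def square(x):
    """Worker function; defined at module level so child processes can import it."""
    return x * x


def main():
    # The pool is created here, not at import time, so a child process that
    # re-imports this file does not try to start workers of its own.
    with mp.Pool(processes=2) as pool:
        results = pool.map(square, [1, 2, 3, 4])
    print(results)
    return results


if __name__ == '__main__':
    main()
```

In the tutorial script, the equivalent change would be to move everything from the dataset creation onward (including the `iter(trainloader)` call) into a function that is only called inside the guard.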