Running into troubles with data path (os.path.isdir(path) returning false when it exists) using FB XLM - directory

I'm trying to run an evaluation on FB's Transcoder (https://github.com/facebookresearch/TransCoder) which implements FB's XLM (cross language model): https://github.com/facebookresearch/XLM#iii-applications-supervised--unsupervised-mt
I have everything configured and follow the instructions as in the github page where I run the validation/test set unzip them and place them into a specific directory: /home/username/TransCoder/data/bpe.cpp-java-python.with_comments/XLM-cpp-java-python-with-comments-functions-sa-cl-split-test.
When I run the following command that is in the run an evaluation section:
python XLM/train.py
--n_heads 8
--bt_steps 'python_sa-cpp_sa-python_sa,cpp_sa-python_sa-cpp_sa,java_sa-cpp_sa-java_sa,cpp_sa-java_sa-cpp_sa,python_sa-java_sa-python_sa,java_sa-python_sa-java_sa' # The evaluator will use this parameter to infer the languages to test on
--max_vocab '-1'
--word_blank '0.1'
--n_layers 6
--generate_hypothesis true
--max_len 512
--bptt 256
--fp16 true
--share_inout_emb true
--tokens_per_batch 6000
--has_sentences_ids true
--eval_bleu true
--split_data false
--data_path '/home/salzubi/TransCoder/data/bpe.cpp-java-python.with_comments/XLM-cpp-java-python-with-comments-functions-sa-cl-split-test'
--eval_computation true
--batch_size 32
--reload_model 'model_1.pth,model_1.pth'
--amp 2
--max_batch_size 128
--ae_steps 'cpp_sa,python_sa,java_sa'
--emb_dim 1024
--eval_only True
--beam_size 10
--retry_mistmatching_types 1
--dump_path '/tmp/'
--exp_name='eval_final_model_wc_30'
--lgs 'cpp_sa-java_sa-python_sa'
--encoder_only=False
I get the following error:
Traceback (most recent call last):
File "XLM/train.py", line 337, in <module>
check_data_params(params)
File "/home/username/TransCoder/XLM/src/data/loader.py", line 246, in check_data_params
assert os.path.isdir(params.data_path), params.data_path
AssertionError
I'm not sure why this is the case?
$ python
import os
print(os.path.isdir("/home/username/TransCoder/data/bpe.cpp-java-python.with_comments/XLM-cpp-java-python-with-comments-functions-sa-cl-split-test"))
output:
True
Any ideas on how to fix this?

Related

Calling Julia from Streamlit App using PyJulia

I'm trying to use a julia function from a streamlit app. Created a toy example to test the interaction, simply returning a matrix from a julia functions based on a single parameter to specify the value of the diagonal elements.
Will also note at the outset that both julia_import_method = "api_compiled_false" and julia_import_method = "main_include" works when importing the function in Spyder IDE (rather than at the command line to launch the streamlit app via streamlit run streamlit_julia_test.py).
My project directory looks like:
├── my_project_directory
│   ├── julia_test.jl
│   └── streamlit_julia_test.py
The julia function is in julia_test.jl and just simply returns a matrix with diagonals specified by the v parameter:
function get_matrix_from_julia(v::Int)
m=[v 0
0 v]
return m
end
The streamlit app is streamlit_julia_test.py and is defined as:
import os
from io import BytesIO
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import streamlit as st
# options:
# main_include
# api_compiled_false
# dont_import_julia
julia_import_method = "api_compiled_false"
if julia_import_method == "main_include":
# works in Spyder IDE
import julia
from julia import Main
Main.include("julia_test.jl")
elif julia_import_method == "api_compiled_false":
# works in Spyder IDE
from julia.api import Julia
jl = Julia(compiled_modules=False)
this_dir = os.getcwd()
julia_test_path = """include(\""""+ this_dir + """/julia_test.jl\"""" +")"""
print(julia_test_path)
jl.eval(julia_test_path)
get_matrix_from_julia = jl.eval("get_matrix_from_julia")
elif julia_import_method == "dont_import_julia":
print("Not importing ")
else:
ValueError("Not handling this case:" + julia_import_method)
st.header('Using Julia in Streamlit App Example')
st.text("Using Method:" + julia_import_method)
matrix_element = st.selectbox('Set Matrix Diagonal to:', [1,2,3])
matrix_numpy = np.array([[matrix_element,0],[0,matrix_element]])
col1, col2 = st.columns([4,4])
with col1:
fig, ax = plt.subplots(figsize=(5,5))
sns.heatmap(matrix_numpy, ax = ax, cmap="Blues",annot=True)
ax.set_title('Matrix Using Python Numpy')
buf = BytesIO()
fig.savefig(buf, format="png")
st.image(buf)
with col2:
if julia_import_method == "dont_import_julia":
matrix_julia = matrix_numpy
else:
matrix_julia = get_matrix_from_julia(matrix_element)
fig, ax = plt.subplots(figsize=(5,5))
sns.heatmap(matrix_julia, ax = ax, cmap="Blues",annot=True)
ax.set_title('Matrix from External Julia Script')
buf = BytesIO()
fig.savefig(buf, format="png")
st.image(buf)
If the app were working correctly, it would look like this (which can be reproduced by setting the julia_import_method = "dont_import_julia" on line 13):
Testing
When I try julia_import_method = "main_include", I get the well known error:
julia.core.UnsupportedPythonError: It seems your Julia and PyJulia setup are not supported.
Julia executable:
julia
Python interpreter and libpython used by PyCall.jl:
/Users/myusername/opt/anaconda3/bin/python3
/Users/myusername/opt/anaconda3/lib/libpython3.9.dylib
Python interpreter used to import PyJulia and its libpython.
/Users/myusername/opt/anaconda3/bin/python
/Users/myusername/opt/anaconda3/lib/libpython3.9.dylib
Your Python interpreter "/Users/myusername/opt/anaconda3/bin/python"
is statically linked to libpython. Currently, PyJulia does not fully
support such Python interpreter.
The easiest workaround is to pass `compiled_modules=False` to `Julia`
constructor. To do so, first *reboot* your Python REPL (if this happened
inside an interactive session) and then evaluate:
>>> from julia.api import Julia
>>> jl = Julia(compiled_modules=False)
Another workaround is to run your Python script with `python-jl`
command bundled in PyJulia. You can simply do:
$ python-jl PATH/TO/YOUR/SCRIPT.py
See `python-jl --help` for more information.
For more information, see:
https://pyjulia.readthedocs.io/en/latest/troubleshooting.html
As suggested, when I set julia_import_method = "api_compiled_false", I get a seg fault:
include("my_project_directory/julia_test.jl")
2022-04-03 10:23:13.406 Traceback (most recent call last):
File "/Users/myusername/opt/anaconda3/lib/python3.9/site-packages/streamlit/script_runner.py", line 430, in _run_script
exec(code, module.__dict__)
File "my_project_directory/streamlit_julia_test.py", line 25, in <module>
jl = Julia(compiled_modules=False)
File "/Users/myusername/.local/lib/python3.9/site-packages/julia/core.py", line 502, in __init__
if not self.api.was_initialized: # = jl_is_initialized()
File "/Users/myusername/.local/lib/python3.9/site-packages/julia/libjulia.py", line 114, in __getattr__
return getattr(self.libjulia, name)
File "/Users/myusername/opt/anaconda3/lib/python3.9/ctypes/__init__.py", line 395, in __getattr__
func = self.__getitem__(name)
File "/Users/myuserame/opt/anaconda3/lib/python3.9/ctypes/__init__.py", line 400, in __getitem__
func = self._FuncPtr((name_or_ordinal, self))
AttributeError: dlsym(0x21b8e5840, was_initialized): symbol not found
signal (11): Segmentation fault: 11
in expression starting at none:0
Allocations: 35278345 (Pool: 35267101; Big: 11244); GC: 36
zsh: segmentation fault streamlit run streamlit_julia_test.py
I've also tried the alternative recommendation provided in the PyJulia response message regarding the use of:
python-jl my_project_directory/streamlit_julia_test.py
But I get this error when running the python-jl command:
INTEL MKL ERROR: dlopen(/Users/myusername/opt/anaconda3/lib/libmkl_intel_thread.1.dylib, 0x0009): Library not loaded: #rpath/libiomp5.dylib
Referenced from: /Users/myusername/opt/anaconda3/lib/libmkl_intel_thread.1.dylib
Reason: tried: '/Applications/Julia-1.7.app/Contents/Resources/julia/bin/../lib/libiomp5.dylib' (no such file), '/usr/local/lib/libiomp5.dylib' (no such file), '/usr/lib/libiomp5.dylib' (no such file).
Intel MKL FATAL ERROR: Cannot load libmkl_intel_thread.1.dylib.
So I'm stuck, thanks in advance for a modified reproducible example or instructions for the following system specs!
System specs:
Mac OS Monterey 12.2.1 (Chip - Mac M1 Pro)
Python 3.9.7
Julia 1.7.2
PyJulia 0.5.8.dev
Streamlit 1.7.0
yes we can use it streamlit_julia_test.py NumPy for instance

Is it possible to run a custom OpenAI gym environment entirely from within Jupyter Notebook

Long story short: I have been given some Python code for a custom openAI gym environment. I can successfully run the code via ExperimentGrid from the command line but would like to be able to run the entire experiment from within Jupyter notebook, rather than calling scripts. This would be more convenient for some experiments that I will be doing farther down the road.
My question: Is it possible to execute an experiment on a custom OpenAI gym environment entirely from within Jupyter Notebook and if so, how? I've seen plenty of examples of people executing gym's standard environments (like SpaceInvaders-v0 or CartPole-v0) from Jupyter but even then, they are calling the environment with
env=gym.make('SpaceInvaders-v0')
and essentially executing that environment's script behind the scenes.
Below is a basic description of how my code is set-up to run from the command line and the errors that I'm getting in Jupyter.
Any advice would be appreciated. I am admittedly rather new to Gym, Python and Linux.
My basic environment code is structured like this in, say, envs/mygames/Custom_Env.py:
various import statements (numpy, gym, pyglet, copy)
class Entity()
class State()
class The_Custom_Env(core.Env) # This is the main environment class
class Shell_Class # This class calls The_Custom_Env and provides some arguments
In mygames/__ init__.py,I import the Shell_Class:
from gym.envs.mygames.Custom_Env import Shell_Class
In envs/__ init__.py, I have the environment registered
register(
id='TEST-v0',
entry_point='gym.envs.mygames:Shell_Class',
max_episode_steps=200,
reward_threshold=25.0,)
Finally, if I execute a script containing this code from the command line, the experiment works without issue:
from spinup.utils.run_utils import ExperimentGrid
from spinup import ppo_pytorch
import torch
if __name__ == '__main__':
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--cpu', type=int, default=4)
parser.add_argument('--num_runs', type=int, default=1)
args = parser.parse_args()
eg = ExperimentGrid(name='super-cool-test')
eg.add('env_name', 'TEST-v0', '', True)
eg.add('seed', [10*i for i in range(args.num_runs)])
eg.add('epochs', [10])
eg.add('steps_per_epoch', 4000)
eg.add('ac_kwargs:hidden_sizes', [(32, 32)], 'hid')
eg.add('ac_kwargs:activation', [torch.nn.ReLU], '')
eg.add('pi_lr', [0.001])
eg.add('clip_ratio', 0.3)
eg.run(ppo_pytorch, num_cpu=args.cpu)
My Jupyter Attempt
I put all of the code from Custom_env.py in cell #1.
I then registered the environment in cell #2:
gym.register(
id='TEST-v1',
entry_point='__main__:Shell_Class',
max_episode_steps=200,
reward_threshold=25.0,)
based on this Q/A: Register gym environment that is defined inside a jupyter notebook cell
, I make the environment in cell #3:
gym.make('TEST-v1')
and get this non-descriptive output:
<TimeLimit<Shell_Class< TEST-v1 >>>
In cell #4, I tried to execute ExperimentGrid code directly within Jupyter like so:
from spinup.utils.run_utils import ExperimentGrid
from spinup import ppo_pytorch
import torch
num_runs=1
cpu=4
env_name='TEST-v1'
eg = ExperimentGrid(name='Jupyter-test')
eg.add('env_name', env_name, '', True)
eg.add('seed', [10*i for i in range(num_runs)])
eg.add('epochs', 500)
eg.add('steps_per_epoch', 4000)
eg.add('ac_kwargs:hidden_sizes', [(32, 32)], 'hid')
eg.add('ac_kwargs:activation', [torch.nn.ReLU], '')
eg.add('pi_lr', 0.001)
eg.add('clip_ratio', 0.3)
eg.run(ppo_pytorch, num_cpu=cpu)
The experiment starts up as usual but then runs into some kind of error:
> ================================================================================
ExperimentGrid [Jupyter-test] runs over parameters:
env_name []
TEST-v1
seed [see]
0
epochs [epo]
500
steps_per_epoch [ste]
4000
ac_kwargs:hidden_sizes [hid]
(32, 32)
ac_kwargs:activation []
ReLU
pi_lr [pi]
0.001
clip_ratio [cli]
0.3
Variants, counting seeds: 1
Variants, not counting seeds: 1
================================================================================
Preparing to run the following experiments...
Jupyter-test_test-v1
================================================================================
Launch delayed to give you a few seconds to review your experiments.
To customize or disable this behavior, change WAIT_BEFORE_LAUNCH in
spinup/user_config.py.
================================================================================
Running experiment:
Jupyter-test_test-v1
with kwargs:
{
"ac_kwargs": {
"activation": "ReLU",
"hidden_sizes": [
32,
32
]
},
"clip_ratio": 0.3,
"env_name": "TEST-v1",
"epochs": 500,
"pi_lr": 0.001,
"seed": 0,
"steps_per_epoch": 4000
}
================================================================================
There appears to have been an error in your experiment.
Check the traceback above to see what actually went wrong. The
traceback below, included for completeness (but probably not useful
for diagnosing the error), shows the stack leading up to the
experiment launch.
================================================================================
---------------------------------------------------------------------------
CalledProcessError Traceback (most recent call last)
<ipython-input-14-de843fd528cf> in <module>
15 eg.add('pi_lr', 0.001)
16 eg.add('clip_ratio', 0.3)
---> 17 eg.run(ppo_pytorch, num_cpu=cpu)
~/Downloads/spinningup/spinup/utils/run_utils.py in run(self, thunk, num_cpu, data_dir, datestamp)
544
545 call_experiment(exp_name, thunk_, num_cpu=num_cpu,
--> 546 data_dir=data_dir, datestamp=datestamp, **var)
547
548
~/Downloads/spinningup/spinup/utils/run_utils.py in call_experiment(exp_name, thunk, seed, num_cpu, data_dir, datestamp, **kwargs)
169 cmd = [sys.executable if sys.executable else 'python', entrypoint, encoded_thunk]
170 try:
--> 171 subprocess.check_call(cmd, env=os.environ)
172 except CalledProcessError:
173 err_msg = '\n'*3 + '='*DIV_LINE_WIDTH + '\n' + dedent("""
~/anaconda3/envs/spinningup/lib/python3.6/subprocess.py in check_call(*popenargs, **kwargs)
309 if cmd is None:
310 cmd = popenargs[0]
--> 311 raise CalledProcessError(retcode, cmd)
312 return 0
313

GCP Composer/Airflow calling Dataflow/beam throwing error

I have a GCP Cloud Composer environment with airflow version composer-1.10.0-airflow-1.10.6 and python 3, 3.6 to be precise. I am calling an apache-beam pipeline on Dataflow using a python_operator.PythonOperator, operator. Here is the code snippet
Calling the pipeline function
test_greeting = python_operator.PythonOperator(
task_id='python_pipeline',
python_callable=run_pipeline
)
The pipeline function is as follows
def run_pipeline():
print("Test Pipeline")
pipeline_args=[
"--runner","DataflowRunner",
"--project","*****",
"--temp_location","gs://******/temp",
"--region","us-east1",
"--job_name","job1199",
"--zone","us-east1-b"
]
pipeline_options=PipelineOptions(pipeline_args)
pipe=beam.Pipeline(options=pipeline_options)
small_sum = (
pipe
| beam.Create([18,5,7,7,9,23,13,5])
| "Combine Globally" >> beam.CombineGlobally(AverageFn())
| 'Write results' >> beam.io.WriteToText('gs://******/ouptut_from_pipline/combine')
)
run_result=pipe.run()
run_result.wait_until_finish()
return "True"
When I run this the pipeline execution runs in dataflow but fails with the following error
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/dataflow_worker/batchworker.py", line 648, in do_work
work_executor.execute()
File "/usr/local/lib/python3.6/site-packages/dataflow_worker/executor.py", line 150, in execute
test_shuffle_sink=self._test_shuffle_sink)
File "/usr/local/lib/python3.6/site-packages/dataflow_worker/executor.py", line 116, in create_operation
is_streaming=False)
File "apache_beam/runners/worker/operations.py", line 1032, in apache_beam.runners.worker.operations.create_operation
File "apache_beam/runners/worker/operations.py", line 845, in apache_beam.runners.worker.operations.create_pgbk_op
File "apache_beam/runners/worker/operations.py", line 903, in apache_beam.runners.worker.operations.PGBKCVOperation.__init__
File "/usr/local/lib/python3.6/site-packages/apache_beam/internal/pickler.py", line 290, in loads
return dill.loads(s)
File "/usr/local/lib/python3.6/site-packages/dill/_dill.py", line 275, in loads
return load(file, ignore, **kwds)
File "/usr/local/lib/python3.6/site-packages/dill/_dill.py", line 270, in load
return Unpickler(file, ignore=ignore, **kwds).load()
File "/usr/local/lib/python3.6/site-packages/dill/_dill.py", line 472, in load
obj = StockUnpickler.load(self)
File "/usr/local/lib/python3.6/site-packages/dill/_dill.py", line 462, in find_class
return StockUnpickler.find_class(self, module, name)
ModuleNotFoundError: No module named 'unusual_prefix_162ac8b7030d5bd1ff5f128a26483932d3968a4d_python_bash'
The beam version is Apache Beam Python 3.6 SDK 2.19.0.
I suspect the version of Python 3.6 may be the issue as calling the pipeline directly (as a runner) from my local system works fine and my local system is running python 3.7.
I cant find a way to test this theory though.
It would be helpful to get tips of how to resolve this issue.

How can I play a .wav file in python 3.6?

Currently, I'm making a program using python 3.6, and I've been trying to make a sound effect occur after pressing a button. I found something from a different question asked, found here. Down below is my code.
from tkinter import *
window = Tk()
c = Canvas(window, height=100, width=100, bg='blue')
c.pack()
with open('Users/lenonvo/AppData/Local/Programs/Python/Python 3.6/Python 3
Files/Python [3.6.3]/Sounds/Blook Game/Attack.wav','rb') as f:
data = base64.b64encode(f.read())
def playSound():
sound = winsound.PlaySound(base64.b64decode(data), winsound.SND_MEMORY)
def sound(event):
global sound
if event.keysym == 'Up':
playSound()
c.pack('<Key>', sound)
Then I got this error message:
RESTART: C:\Users\lenonvo\AppData\Local\Programs\Python\Python36\Python 3
Files\Python 3.6.3\Test2.py
Traceback (most recent call last):
File "C:\Users\lenonvo\AppData\Local\Programs\Python\Python36\Python 3
Files\Python 3.6.3\Test2.py", line 7, in <module>
with open('Users/lenonvo/AppData/Local/Programs/Python/Python 3.6/Python
3 Files/Python [3.6.3]/Sounds/Blook Game/Attack.wav','rb') as f:
FileNotFoundError: [Errno 2] No such file or directory:
'Users/lenonvo/AppData/Local/Programs/Python/Python 3.6/Python 3
Files/Python [3.6.3]/Sounds/Blook Game/Attack.wav'
Any suggestions?
Also, I am not an advanced python coder, so if your answer could be simplistic, it would be very much appreciated. =]
Now one problem is fixed, thanks to Patrick Artner, but now this error message comes up:
data = base64.b64decode(f.read())
NameError: name 'base64' is not defined
This:
with open('Users/lenonvo/AppData/Local/Programs/Python/Python 3.6/Python 3 Files/Python [3.6.3]/Sounds/Blook Game/Attack.wav','rb') as f:
is a relative path to where your prog runs.
Use
with open('C:/Users/lenonvo/AppData/Local/Programs/Python/Python 3.6/Python 3 Files/Python [3.6.3]/Sounds/Blook Game/Attack.wav','rb') as f:
^^^^

pexpect python throw error

Although this is my first attempt at using pexpect, the python3 script using pexpect is pretty simple; yet it fails.
#!/usr/bin/env python3
import sys
import pexpect
SSH_NEWKEY = r'Are you sure you want to continue connecting \(yes/no\)\?'
child = pexpect.spawn("ssh -i /user/aws/key.pem ec2-user#xxx.xxx.xxx.xxx date")
i = child.expect( [ pexpect.TIMEOUT, SSH_NEWKEY )
if i == 1:
child.sendline('yes')
print(child.before)
The SSH_NEWKEY is the only response I'm expecting, but the example showed a list containing pexpect.TIMEOUT in it so I used it.
$ ./test.py
Traceback (most recent call last):
File "/usr/local/lib/python3.4/site-packages/pexpect/spawnbase.py", line 144, in read_nonblocking
s = os.read(self.child_fd, size)
OSError: [Errno 5] Input/output error
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.4/site-packages/pexpect/expect.py", line 97, in expect_loop
incoming = spawn.read_nonblocking(spawn.maxread, timeout)
File "/usr/local/lib/python3.4/site-packages/pexpect/pty_spawn.py", line 455, in read_nonblocking
return super(spawn, self).read_nonblocking(size)
File "/usr/local/lib/python3.4/site-packages/pexpect/spawnbase.py", line 149, in read_nonblocking
raise EOF('End Of File (EOF). Exception style platform.')
pexpect.exceptions.EOF: End Of File (EOF). Exception style platform.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "./min.py", line 15, in <module>
i = child.expect( [ pexpect.TIMEOUT, SSH_NEWKEY ] )
File "/usr/local/lib/python3.4/site-packages/pexpect/spawnbase.py", line 315, in expect
timeout, searchwindowsize, async)
File "/usr/local/lib/python3.4/site-packages/pexpect/spawnbase.py", line 339, in expect_list
return exp.expect_loop(timeout)
File "/usr/local/lib/python3.4/site-packages/pexpect/expect.py", line 102, in expect_loop
return self.eof(e)
File "/usr/local/lib/python3.4/site-packages/pexpect/expect.py", line 49, in eof
raise EOF(msg)
pexpect.exceptions.EOF: End Of File (EOF). Exception style platform.
<pexpect.pty_spawn.spawn object at 0x7f70ea4fbcf8>
command: /usr/bin/ssh
args: ['/usr/bin/ssh', '-i', '/user/aws/key.pem', 'ec2-user#xxx.xxx.xxx.xxx', 'date']
searcher: None
buffer (last 100 chars): b''
before (last 100 chars): b'Fri May 6 13:50:18 EDT 2016\r\n'
after: <class 'pexpect.exceptions.EOF'>
match: None
match_index: None
exitstatus: 0
flag_eof: True
pid: 31293
child_fd: 5
closed: False
timeout: 30
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 2000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1
What am I missing?
CentOS 6.4
python 3.4.3
An EOF error is being raised during your expect call. This means that the response received does not match SSH_NEWKEY, and reaches end of file within the timeout period. To catch this exception, you should change your except line to read:
i = child.expect( [ pexpect.TIMEOUT, SSH_NEWKEY, pexpect.EOF)
You can then make your if more robust:
if i == 1:
child.sendline('yes')
elif i == 0:
print "Timeout"
elif i == 2:
print "EOF"
print(child.before)
This doesn't solve the reason behind why you are on receiving a response with the expected string - it's hard to know without looking at more code but it's likely because you have the response slightly wrong. If you manually type in the SSH string, you should be able to see the response you can expect, and enter this response into your code.
You can also print child.before after your expect call, or print child.read() instead of your expect call to see what is being sent back as a response.

Resources