Trying to install splash in R, without luck

EDIT: I followed Alistaire's advice below and installed docker and numpy with pip.
However, this yields a new error when running install_splash() in R:
> install_splash()
Error in py_call_impl(callable, dots$args, dots$keywords) :
AttributeError: module 'os' has no attribute 'errno'
3. stop(structure(list(message = "AttributeError: module 'os' has no
attribute 'errno'",
call = py_call_impl(callable, dots$args, dots$keywords),
cppstack = structure(list(file = "", line = -1L, stack = "C++ stack not available on this system"), class =
"Rcpp_stack_trace")), class = c("Rcpp::exception",
"C++Error", "error", "condition")))
2. client$api$pull("scrapinghub/splash", tag)
1. install_splash()
When I rerun install_splash() with debug, I get the following message:
Detailed traceback:
File "C:\Python37\lib\site-packages\docker\api\image.py", line 380, in pull
header = auth.get_config_header(self, registry)
File "C:\Python37\lib\site-packages\docker\auth.py", line 48, in get_config_header
client._auth_configs, registry, credstore_env=client.credstore_env
File "C:\Python37\lib\site-packages\docker\auth.py", line 96, in resolve_authconfig
authconfig, registry, store_name, env=credstore_env
File "C:\Python37\lib\site-packages\docker\auth.py", line 129, in _resolve_authconfig_credstore
data = store.get(registry)
File "C:\Python37\lib\site-packages\dockerpycreds\store.py", line 35, in get
data = self._execute('get', server)
File "C:\Python37\lib\site-packages\dockerpycreds\store.py", line 89, in _execute
if e.errno == os.errno.ENOENT:
What gives? I know Splash is up and running, and I've confirmed the correct installation of the docker and numpy modules with pip freeze.
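Note: os.errno was removed in Python 3.7, and the dockerpycreds line at the bottom of the traceback still references it, so this AttributeError likely comes from outdated docker/dockerpycreds modules rather than from Splash itself. A hedged sketch of one possible fix from R, assuming pip targets the same Python 3.7 that reticulate uses:
# upgrade the modules that still call os.errno (a sketch, not a confirmed fix)
reticulate::py_install(c("docker", "dockerpycreds"), pip = TRUE)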
Original post:
I am trying to scrape tables from a number of websites in R. To do this, I've been recommended to use Splash through Docker. I've downloaded Docker and managed to get it up and running. Additionally, I've installed Python 3.5. I have pulled the Splash image with the command:
docker pull scrapinghub/splash
and started the container with the command:
docker run -p 8050:8050 -p 5023:5023 scrapinghub/splash
I've checked that Splash is indeed running properly by opening 'http://localhost:8050/' in my browser - it works.
In R, I've run the following command:
> splash_active()
which returns this:
Status of splash instance on [http://localhost:8050]: ok. Max RSS: 73 Mb
[1] TRUE
So far so good. Now, I am trying to install Splash in R, by the command:
install_splash()
But R returns an error, saying:
Error: Python module docker was not found.
Detected Python configuration:
python: C:\Users\Lucas\AppData\Local\Programs\Python\Python37\\python.exe
libpython: C:/Users/Lucas/AppData/Local/Programs/Python/Python37/python37.dll
pythonhome: C:\Users\Lucas\AppData\Local\Programs\Python\Python37
version: 3.7.0 (v3.7.0:1bf9cc5093, Jun 27 2018, 04:59:51) [MSC v.1914 64 bit (AMD64)]
Architecture: 64bit
numpy: [NOT FOUND]
docker: [NOT FOUND]
What could be the issue? Is it related to 'numpy' and 'docker' not being found?
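Note that the configuration above points at the interpreter under AppData, while the traceback in the EDIT shows modules under C:\Python37, so pip may be installing docker and numpy into a different Python than the one reticulate detected. A minimal diagnostic sketch, assuming only that the reticulate package is installed:
library(reticulate)
py_config()                    # shows which interpreter reticulate bound to
py_module_available("docker")  # FALSE suggests pip installed into another Python
py_module_available("numpy")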

Related

Why is blogdown putting a mamba command through normalizePath?

Here's what I'm doing:
I have a blog that uses blogdown to render .Rmd files.
Some of the code snippets in the blog are in Python. I'm using reticulate for that.
I'm using a GitHub workflow to build and publish the blog as part of a larger website. This workflow sets up the environment and package dependencies in miniconda.
The last time this ran was six months ago. At that time, it worked. Now, it does not. I can't seem to replicate the behavior locally for more detailed debugging.
It seems to be trying to put a mamba command into normalizePath instead of a filesystem path (www-main is the name of the repository):
conda activate www-main
Rscript -e 'blogdown::build_site(local=FALSE, run_hugo=FALSE, build_rmd="content/blog/2020-08-28-api.Rmd")'
shell: /usr/bin/bash -l {0}
env:
  CONDA_PKGS_DIR: /home/runner/conda_pkgs_dir
Rendering content/blog/2020-08-28-api.Rmd...
[...]
Quitting from lines 401-410 (2020-08-28-api.Rmd)
Error in normalizePath(conda, winslash = "/", mustWork = TRUE) :
path[1]="# cmd: /usr/share/miniconda/condabin/mamba update --name www-main --file /home/runner/work/www-main/www-main/conda": No such file or directory
Calls: local ... python_munge_path -> get_python_conda_info -> normalizePath
Execution halted
Error: Failed to render content/blog/2020-08-28-api.Rmd
Execution halted
Lines 401-410 of 2020-08-28-api.Rmd are a Python code block:
400 ```{python python-data, dev='svg'}
401 import covidcast
402 from datetime import date
403 import matplotlib.pyplot as plt
404
405 data = covidcast.signal("fb-survey", "smoothed_hh_cmnty_cli",
406                         date(2020, 9, 8), date(2020, 9, 8),
407                         geo_type="state")
408 covidcast.plot_choropleth(data, figsize=(7, 5))
409 plt.title("% who know someone who is sick, Sept 8, 2020")
410 ```
The useful bits of the output of conda info, in case it helps:
active environment : www-main
active env location : /usr/share/miniconda/envs/www-main
shell level : 1
user config file : /home/runner/.condarc
populated config files : /home/runner/.condarc
conda version : 4.12.0
conda-build version : not installed
python version : 3.9.12.final.0
virtual packages : __linux=5.15.0=0
__glibc=2.31=0
__unix=0=0
__archspec=1=x86_64
base environment : /usr/share/miniconda (writable)
conda av data dir : /usr/share/miniconda/etc/conda
conda av metadata url : None
channel URLs : https://conda.anaconda.org/conda-forge/linux-64
https://conda.anaconda.org/conda-forge/noarch
https://repo.anaconda.com/pkgs/main/linux-64
https://repo.anaconda.com/pkgs/main/noarch
https://repo.anaconda.com/pkgs/r/linux-64
https://repo.anaconda.com/pkgs/r/noarch
package cache : /home/runner/conda_pkgs_dir
envs directories : /usr/share/miniconda/envs
/home/runner/.conda/envs
platform : linux-64
user-agent : conda/4.12.0 requests/2.27.1 CPython/3.9.12 Linux/5.15.0-1020-azure ubuntu/20.04.5 glibc/2.31
UID:GID : 1001:121
netrc file : None
offline mode : False
I found this, but their workaround doesn't make sense for me since I'm not using papermill: https://github.com/rstudio/reticulate/issues/1184
I found this, but my paths don't have spaces: https://github.com/rstudio/reticulate/issues/1149
I found this, but their problem includes an entirely reasonable value for path[1], unlike mine: How can I tell R where the conda environment is via a docker image?
The build environment for this is a bit of a bear, but I can probably put together a minimum working (/nonworking) example if needed; let me know.
I tracked this down to at least two bits of weird/buggy behavior in reticulate and found a workaround: switch from vanilla miniconda to Mambaforge.
The TL;DR seems to be that whatever wacky environment ubuntu-latest setup-miniconda@v2 creates no longer puts a create line into meta/history, which is what reticulate needs in order to figure out which conda goes with which python. That matters because (1) reticulate ignores the reticulate.conda_binary setting for some reason, and (2) it uses a more restrictive regex to parse the lines of the history file than the regex it uses to select them. Mambaforge does include the create line, so reticulate is happy.
- uses: conda-incubator/setup-miniconda@v2
  with:
    python-version: 3.9
    activate-environment: www-main
    miniforge-variant: Mambaforge
    miniforge-version: latest
    use-mamba: true
    use-only-tar-bz2: true # (for caching support)
- name: Update environment
  run: mamba env update -n www-main -f environment.yml
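As a hedged sanity check after the switch (assuming reticulate is installed in the www-main environment), resolving the environment explicitly confirms the conda/python pairing without relying on the meta/history parsing described above:
Rscript -e 'reticulate::use_condaenv("www-main", required = TRUE); print(reticulate::py_config())'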

Plugin error from dada2: An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more

I have installed QIIME 2 version 2022.2 in a conda environment using the commands given in the QIIME docs.
I have downloaded the data from NCBI SRA and imported it into QIIME 2 with the following commands.
Use the manifest file to import the sequences into QIIME 2:
qiime tools import \
--type 'SampleData[PairedEndSequencesWithQuality]' \
--input-path SRA/fastq/manifest/manifest.tsv \
--output-path demux.qza \
--input-format PairedEndFastqManifestPhred33V2
I visualized the demux.qza file using:
qiime demux summarize --i-data demux.qza --o-visualization demux.qzv
The sequences are already demultiplexed and no adapters were present, but when I tried to create ASVs using DADA2, I got the following error:
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs demux.qza \
  --p-trim-left-f 0 \
  --p-trim-left-r 0 \
  --p-trunc-len-f 220 \
  --p-trunc-len-r 190 \
  --p-n-threads 0 \
  --output-dir work/DADA2_denoising_output \
  --verbose
Running external command line application(s). This may print messages to stdout and/or stderr.
The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.
Command: run_dada_paired.R /tmp/tmpennq2f9g/forward /tmp/tmpennq2f9g/reverse /tmp/tmpennq2f9g/output.tsv.biom /tmp/tmpennq2f9g/track.tsv /tmp/tmpennq2f9g/filt_f /tmp/tmpennq2f9g/filt_r 220 190 0 0 2.0 2.0 2 12 independent consensus 1.0 0 1000000
R version 4.1.2 (2021-11-01)
Loading required package: Rcpp
DADA2: 1.22.0 / Rcpp: 1.0.8 / RcppParallel: 5.1.5
Filtering Error in x$yield(...) :
_DNAencode(): invalid DNAString input character: '' (byte value 4)
Execution halted
Traceback (most recent call last):
File "/home/pragya/anaconda3/envs/qiime2-2022.2/lib/python3.8/site-packages/q2_dada2/_denoise.py", line 279, in denoise_paired
run_commands([cmd])
File "/home/pragya/anaconda3/envs/qiime2-2022.2/lib/python3.8/site-packages/q2_dada2/_denoise.py", line 36, in run_commands
subprocess.run(cmd, check=True)
File "/home/pragya/anaconda3/envs/qiime2-2022.2/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['run_dada_paired.R', '/tmp/tmpennq2f9g/forward', '/tmp/tmpennq2f9g/reverse', '/tmp/tmpennq2f9g/output.tsv.biom', '/tmp/tmpennq2f9g/track.tsv', '/tmp/tmpennq2f9g/filt_f', '/tmp/tmpennq2f9g/filt_r', '220', '190', '0', '0', '2.0', '2.0', '2', '12', 'independent', 'consensus', '1.0', '0', '1000000']' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/pragya/anaconda3/envs/qiime2-2022.2/lib/python3.8/site-packages/q2cli/commands.py", line 339, in call
results = action(**arguments)
File "", line 2, in denoise_paired
File "/home/pragya/anaconda3/envs/qiime2-2022.2/lib/python3.8/site-packages/qiime2/sdk/action.py", line 245, in bound_callable
outputs = self.callable_executor(scope, callable_args,
File "/home/pragya/anaconda3/envs/qiime2-2022.2/lib/python3.8/site-packages/qiime2/sdk/action.py", line 391, in callable_executor
output_views = self._callable(**view_args)
File "/home/pragya/anaconda3/envs/qiime2-2022.2/lib/python3.8/site-packages/q2_dada2/_denoise.py", line 292, in denoise_paired
raise Exception("An error was encountered while running DADA2"
Exception: An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.
Plugin error from dada2:
An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.
See above for debug info.
I have looked for all the possible options but nothing worked. Can somebody help me with this?
Thanks in advance.
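The _DNAencode() message above suggests a corrupt byte inside one of the FASTQ files rather than a problem with the DADA2 parameters. A hedged diagnostic sketch in R, assuming the ShortRead Bioconductor package is installed; the file name is a placeholder for one of the imported files:
library(ShortRead)
# readFastq() itself fails on a corrupt file, which identifies the culprit;
# on a clean file the letter tally should contain only A/C/G/T/N
fq <- readFastq("SRA/fastq/sample_R1.fastq.gz")
colSums(alphabetFrequency(sread(fq)))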

Using the R package reticulate within a Singularity container

reticulate is an R package for calling Python code from R. I find it easy to use on my local computer, and I sometimes get it working within a Singularity container.
In this question I post a simple reprex of code that worked two months ago, at https://github.com/richelbilderbeek/reticulate_on_singularity, and I hope you can help me get it working again.
Here is the part of the Singularity script that creates the Conda environment in the /opt/ormr folder, which works fine. (The full script can be found here, so one can confirm that I use, for example, a fresh Miniconda install. Singularity recommends using a system folder, hence the opt; ormr is Swedish for snake.)
%post
# ...
Rscript -e 'reticulate::conda_create(envname = "/opt/ormr")'
Rscript -e 'reticulate::use_condaenv(condaenv = "/opt/ormr")'
Rscript -e 'reticulate::use_python(python = reticulate:::python_binary_path("/opt/ormr"), required = TRUE)'
Rscript -e 'reticulate::conda_install(packages = "scipy", envname = "/opt/ormr")'
Rscript -e 'reticulate:::conda_list_packages(envname = "/opt/ormr")'
However, in the test section of the Singularity script, this goes wrong, probably because the post section above has full access rights, whereas the test section does not:
%test
# ...
Rscript -e 'reticulate::use_condaenv(condaenv = "/opt/ormr")'
The (I think) relevant part of the error message is shown below (copied from this GitHub Actions log; the full error message is at the bottom of this post):
OSError: [Errno 30] Read-only file system: '/opt/ormr/.tmphx9hr3g2'
`$ /miniconda/bin/conda run --prefix /opt/ormr --no-capture-output python -c import os; print(os.environ['PATH'])`
I would conclude that Conda creates a temporary file in the now-read-only folder /opt/ormr.
How do I fix this problem?
Or, in general, how can I use reticulate from within a Singularity container? A working example would be great (I have found none that work)!
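One hedged idea, given that the traceback below shows conda run failing to write a temporary file under /opt/ormr: binding reticulate directly to the environment's interpreter might sidestep the conda run step entirely. A sketch, not a confirmed fix, assuming conda_create() produced a standard layout with the binary at /opt/ormr/bin/python:
Rscript -e 'reticulate::use_python("/opt/ormr/bin/python", required = TRUE); print(reticulate::py_config())'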
Thanks, Richel Bilderbeek
Full error message
# >>>>>>>>>>>>>>>>>>>>>> ERROR REPORT <<<<<<<<<<<<<<<<<<<<<<
Traceback (most recent call last):
File "/miniconda/lib/python3.9/site-packages/conda/exceptions.py", line 1080, in __call__
return func(*args, **kwargs)
File "/miniconda/lib/python3.9/site-packages/conda/cli/main.py", line 84, in _main
exit_code = do_call(args, p)
File "/miniconda/lib/python3.9/site-packages/conda/cli/conda_argparse.py", line 83, in do_call
return getattr(module, func_name)(args, parser)
File "/miniconda/lib/python3.9/site-packages/conda/cli/main_run.py", line 25, in execute
script_caller, command_args = wrap_subprocess_call(on_win, context.root_prefix, prefix,
File "/miniconda/lib/python3.9/site-packages/conda/utils.py", line 403, in wrap_subprocess_call
with Utf8NamedTemporaryFile(mode='w', prefix=tmp_prefix, delete=False) as fh:
File "/miniconda/lib/python3.9/site-packages/conda/auxlib/compat.py", line 88, in Utf8NamedTemporaryFile
return NamedTemporaryFile(
File "/miniconda/lib/python3.9/tempfile.py", line 541, in NamedTemporaryFile
(fd, name) = _mkstemp_inner(dir, prefix, suffix, flags, output_type)
File "/miniconda/lib/python3.9/tempfile.py", line 251, in _mkstemp_inner
fd = _os.open(file, flags, 0o600)
OSError: [Errno 30] Read-only file system: '/opt/ormr/.tmphx9hr3g2'
`$ /miniconda/bin/conda run --prefix /opt/ormr --no-capture-output python -c import os; print(os.environ['PATH'])`
environment variables:
CIO_TEST=<not set>
CONDA_ROOT=/miniconda
CURL_CA_BUNDLE=<not set>
GITHUB_EVENT_PATH=/github/workflow/event.json
GITHUB_PATH=/__w/_temp/_runner_file_commands/add_path_10663761-8f12-43e3-b986-5f70
909c4322
GOPATH=/go
LD_LIBRARY_PATH=/usr/lib/R/lib:/usr/lib/x86_64-linux-gnu:/usr/lib/jvm/default-
java/lib/server:/.singularity.d/libs
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
REQUESTS_CA_BUNDLE=<not set>
SSL_CERT_FILE=<not set>
USER_PATH=/go/bin:/usr/local/go/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/us
r/bin:/sbin:/bin:/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/loc
al/sbin
active environment : None
user config file : /root/.condarc
populated config files :
conda version : 4.11.0
conda-build version : not installed
python version : 3.9.5.final.0
virtual packages : __linux=5.11.0=0
__glibc=2.33=0
__unix=0=0
__archspec=1=x86_64
base environment : /miniconda (read only)
conda av data dir : /miniconda/etc/conda
conda av metadata url : None
channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
https://repo.anaconda.com/pkgs/main/noarch
https://repo.anaconda.com/pkgs/r/linux-64
https://repo.anaconda.com/pkgs/r/noarch
package cache : /miniconda/pkgs
/root/.conda/pkgs
envs directories : /root/.conda/envs
/miniconda/envs
platform : linux-64
user-agent : conda/4.11.0 requests/2.27.1 CPython/3.9.5 Linux/5.11.0-1027-azure debian/ glibc/2.33
UID:GID : 0:0
netrc file : None
offline mode : False
An unexpected error has occurred. Conda has prepared the above report.
Error in Sys.setenv(PATH = new_path) : wrong length for argument
Calls: <Anonymous> ... use_python -> python_config -> python_munge_path -> Sys.setenv
In addition: Warning message:
In system2(conda, c("run", in_env, run_args, shQuote(cmd), args), :
running command ''/miniconda/bin/conda' run --prefix '/opt/ormr' --no-capture-output 'python' -c "import os; print(os.environ['PATH'])"' had status 1
Execution halted

Why is Globus Connect Personal not working?

I am trying to install and configure Globus Connect Personal for Linux (I have CentOS 8), following this tutorial. However, when I try to set up Globus Connect Personal by running ./globusconnectpersonal -start, I get this error:
Could not find platform independent libraries <prefix>
Could not find platform dependent libraries <exec_prefix>
Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
Python path configuration:
PYTHONHOME = (not set)
PYTHONPATH = (not set)
program name = 'gc.py'
isolated = 0
environment = 1
user site = 1
import site = 1
sys._base_executable = ''
sys.base_prefix = '/tmp/build/80754af9/python_1599203911753/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placeho'
sys.base_exec_prefix = '/tmp/build/80754af9/python_1599203911753/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placeho'
sys.executable = ''
sys.prefix = '/tmp/build/80754af9/python_1599203911753/_h_env_placehold_
Subprocess pid 1722896 exited, rc=1
Traceback (most recent call last):
File "./gc-ctrl.py", line 369, in <module>
start(debug=False)
File "./gc-ctrl.py", line 191, in start
send2clients(fds[2:], mesg.encode('utf-8'))
AttributeError: 'bytes' object has no attribute 'encode'
Does anybody know what this could mean?
I think PYTHONHOME and PYTHONPATH need to be set. I created a conda environment with just the correct version of Python in it. Then I ran ./globusconnectpersonal inside the conda environment.
Using a conda environment also works for the non-GUI form of globus.
I have not tried setting the paths manually.
I ran into the same problem when I was using Python 3.8 in a Miniconda environment. When I disabled conda with:
conda deactivate
Then I could run "globusconnectpersonal -start" with my native Python 2.7. I don't know whether the client needed Python 2 or whether conda was interfering, but this resolved the problem for me.

Jupyter Notebook: Server Error while accessing notebooks

I'm facing the following error while trying to upload or access any file in my Jupyter notebook. NOTE: I'm using an Ubuntu 18.04 VM and Jupyter Notebook 5.7.4.
Server error: Traceback (most recent call last):
  File "/home/abhi1507/anaconda3/lib/python3.7/site-packages/tornado/web.py", line 1592, in _execute
    result = yield result
  File "/home/abhi1507/anaconda3/lib/python3.7/site-packages/tornado/gen.py", line 1133, in run
    value = future.result()
  File "/home/abhi1507/anaconda3/lib/python3.7/site-packages/tornado/gen.py", line 326, in wrapper
    yielded = next(result)
  File "/home/abhi1507/anaconda3/lib/python3.7/site-packages/notebook/services/contents/handlers.py", line 112, in get
    path=path, type=type, format=format, content=content,
  File "/home/abhi1507/anaconda3/lib/python3.7/site-packages/notebook/services/contents/filemanager.py", line 431, in get
    model = self._dir_model(path, content=content)
  File "/home/abhi1507/anaconda3/lib/python3.7/site-packages/notebook/services/contents/filemanager.py", line 337, in _dir_model
    if self.should_list(name) and not is_file_hidden(os_path, stat_res=st):
  File "/home/abhi1507/anaconda3/lib/python3.7/site-packages/notebook/utils.py", line 146, in is_file_hidden_posix
    stat_res = os.stat(abs_path)
OSError: [Errno 40] Too many levels of symbolic links: '/home/abhi1507/q3.sh'
Not sure if it is the same thing, but I suddenly had all my Jupyter notebooks in new conda environments start breaking today. Eventually I found this worked:
pip uninstall tornado
pip install tornado==5.1.1
Jupyter Notebooks and Tornado both released new versions in the past week (circa 2019-03-01) - I think one or both is the problem. If downgrading tornado doesn't work, I'd try the same idea but downgrading jupyter (or both).
Edit: Actual cause/fix described here. Can fix the issue by updating nbconvert to v5.4.1:
pip install --upgrade nbconvert
