Error "Python module docker was not found" with splashr in R

I have installed Docker and pulled the Splash image with
docker pull scrapinghub/splash
and started the container with
docker run -p 8050:8050 -p 5023:5023 scrapinghub/splash
(the output was shown in a screenshot).
The problem is that in R, after running install_splash(), I receive this error:
Error: Python module docker was not found.
Detected Python configuration:
python: C:\Users\m-joudy\AppData\Local\Programs\Python\Python36\\python.exe
libpython: C:/Users/m-joudy/AppData/Local/Programs/Python/Python36/python36.dll
pythonhome: C:\Users\m-joudy\AppData\Local\Programs\Python\Python36
version: 3.6.1 (v3.6.1:69c0db5, Mar 21 2017, 18:41:36) [MSC v.1900 64 bit (AMD64)]
Architecture: 64bit
numpy: [NOT FOUND]
docker: [NOT FOUND]
python versions found:
C:\Users\m-joudy\AppData\Local\Programs\Python\Python36\\python.exe
C:\Users\m-joudy\AppData\Local\Programs\Python\PYTHON~1\\python.exe

From splashr's README we learn that the docker commands you used and install_splash() are alternatives, i.e. after
docker pull scrapinghub/splash
docker run -p 8050:8050 -p 5023:5023 scrapinghub/splash
you should be ready to use
library(splashr)
splash_active()
and be set to use Splash via splashr. If you still want to make install_splash() work, the error message you quoted tells us that the Python modules docker and numpy are missing. How you install Python packages depends on your installation, but one popular way is to use pip, i.e.
pip install numpy
pip install docker
Potentially within a virtual environment.
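If it is unclear whether pip installed into the same interpreter that R detected, a small check can help. The following is only a sketch: module_available is a hypothetical helper name, and sys.executable is a stand-in that you would replace with the "python:" path from the error message.

```python
import subprocess
import sys


def module_available(python_exe, module):
    """Return True if `python_exe -c "import <module>"` exits successfully."""
    result = subprocess.run(
        [python_exe, "-c", f"import {module}"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Assumption: replace sys.executable with the "python:" path that the
    # R error message printed, so the interpreter R detected is the one tested.
    python_exe = sys.executable
    for module in ("docker", "numpy"):
        status = "found" if module_available(python_exe, module) else "NOT FOUND"
        print(f"{module}: {status}")
```

If a module is reported as missing for that interpreter, running pip through it (that python.exe followed by -m pip install docker) installs the package into that interpreter rather than into whichever pip happens to be first on the PATH.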

Related

RStudio Server (community) can't find libR.so even though it is in path

Note: I have filed this as an issue, but I'm not sure if it is really a bug or just something I need to resolve about my system configuration.
This seems to be a library that causes many people trouble with RStudio and RStudio Server. Often people can fix the problem by reinstalling the core R libraries with apt or manually copying or linking the libR.so file to a place where RStudio finds it.
In my case, I'm using a Conda instance for my R executable.
My instance was working and stopped working after I upgraded my Ubuntu 22.04 VM. I tried several things to fix the problem but have not succeeded.
System details
RStudio Edition : Server
RStudio Version : 2022.07.2+576 (Spotted Wakerobin) for Ubuntu Bionic
OS Version : Ubuntu 22.04.1 LTS
R Version : 4.1.3 (2022-03-10) -- "One Push-Up"
Describe the problem in detail
I have a GCP VM running Ubuntu 22.04 which I use for RStudio Server (RSS).
I ran sudo apt update && sudo apt dist-upgrade and restarted. RSS stopped working. I ran sudo rstudio-server verify-installation and received
/usr/lib/rstudio-server/bin/rsession: error while loading shared libraries: libR.so: cannot open shared object file: No such file or directory
I decided to reinstall RSS using:
wget https://download2.rstudio.org/server/jammy/amd64/rstudio-server-2022.07.2-576-amd64.deb
sudo gdebi rstudio-server-2022.07.2-576-amd64.deb
and did not receive any errors during installation. However, verify-installation gave the same error as above. I then tried conda update --all -y, which also did not fix the problem.
Here is some useful information:
(base) balter@rstudio:~$ which R
/home/balter/conda/bin/R
(base) balter@rstudio:~$ head -n3 /etc/rstudio/rserver.conf
# Server Configuration File
rsession-which-r=/home/balter/conda/bin/R
(base) balter@rstudio:~$ find . -name "libR.so"
./conda/lib/R/lib/libR.so
./conda/pkgs/r-base-4.1.3-h7880091_3/lib/R/lib/libR.so
(base) balter@rstudio:~$ sudo rstudio-server verify-installation
TTY detected. Printing informational message about logging configuration. Logging configuration loaded from '/etc/rstudio/logging.conf'. Logging to '/var/log/rstudio/rstudio-server/rserver.log'.
/usr/lib/rstudio-server/bin/rsession: error while loading shared libraries: libR.so: cannot open shared object file: No such file or directory
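The find output shows that libR.so exists, so one thing worth checking is whether its directory is visible to the dynamic loader. The sketch below is a diagnostic only, not a confirmed fix. The helper names are hypothetical, and it inspects only LD_LIBRARY_PATH, while the loader also consults other sources such as the ldconfig cache.

```python
import os
from pathlib import Path


def find_shared_lib(root, name="libR.so"):
    """Return the sorted directories under root that contain a file called name."""
    return sorted({str(path.parent) for path in Path(root).rglob(name)})


def dirs_on_ld_path(dirs, ld_library_path=None):
    """Return the entries of dirs that are listed in LD_LIBRARY_PATH.

    ld_library_path defaults to the current environment's value; passing it
    explicitly makes the check testable.
    """
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")
    listed = {
        os.path.abspath(entry)
        for entry in ld_library_path.split(os.pathsep)
        if entry
    }
    return [d for d in dirs if os.path.abspath(d) in listed]
```

Calling find_shared_lib on the home directory mirrors the find command above. If dirs_on_ld_path returns an empty list for the result, the Conda library directory is not on LD_LIBRARY_PATH for the process that ran the check, which would be consistent with the rsession error.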

Python packages required to run RStudio Quarto files in VS Code

I have been trying to run an RStudio Quarto script on a fresh Ubuntu 20.04 installation but ran into some trouble. Some Python packages required to run the simple hello.qmd were missing. I was getting these errors:
ModuleNotFoundError: No module named 'nbclient'
and a second error:
ModuleNotFoundError: No module named 'matplotlib_inline'
The first error occurred because I had not installed the nbclient package. My default Python installation was Python 2.7, and Quarto will not run well with it; use Python 3.7+ instead. If your Linux distribution doesn't ship it by default, install another Python version and switch between the installed versions with:
sudo update-alternatives --config python
If no Python version shows up, you first have to register all your installed Python versions as alternatives. This is explained well at https://www.rosehosting.com/blog/how-to-install-and-switch-python-versions-on-ubuntu-20-04/
Once you have registered all your Python versions, every time you run
sudo update-alternatives --config python, you will be prompted to choose the Python version you want as the default. On a fresh Ubuntu 20.04 you will most likely have two: Python 2.7 and Python 3.8. Select 3.8 and you will be fine. Quarto won't work with Python 2.7.
Once Python 3 is the default, install nbclient with:
pip install nbclient
The first error will now go away, but you will most likely get
ModuleNotFoundError: No module named 'matplotlib_inline'. This is because you also need to install the matplotlib-inline package. This is not documented in Quarto's installation instructions, but it is easy to fix. Run:
pip install matplotlib-inline
Now go back to VS Code, open the command palette and run Quarto: Render, or just type in the terminal:
quarto preview hello.qmd --no-browser --no-watch-inputs
You are done!
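The requirements described above (Python 3.7+ plus the two modules from the error messages) can be checked in one go before rendering. This is a minimal sketch: preflight is a hypothetical helper, not part of Quarto, and it has to be run with the same Python that Quarto uses.

```python
import importlib.util
import sys

# Import names taken from the two ModuleNotFoundError messages above.
REQUIRED = ("nbclient", "matplotlib_inline")


def preflight(required=REQUIRED, min_version=(3, 7)):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info < tuple(min_version):
        have = ".".join(str(n) for n in sys.version_info[:2])
        need = ".".join(str(n) for n in min_version)
        problems.append(f"Python {have} is older than {need}")
    for name in required:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing module: {name}")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print(problem)
```

An empty result only means the modules can be located by this interpreter; it does not guarantee that Quarto picks the same interpreter.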

RStudio not starting on Ubuntu 22.04 - error while loading shared library: libcrypto.so.1.1

There's a libcrypto.so.1 in the system. I created a symlink to it named libcrypto.so.1.1, but RStudio still does not find it.
I'm launching RStudio from the binary download: https://download1.rstudio.org/desktop/bionic/amd64/rstudio-2022.02.2-485-amd64-debian.tar.gz
This fixes it. Ubuntu 22.04 no longer packages libssl1.1 (which provides libcrypto.so.1.1), so install it manually:
wget http://nz2.archive.ubuntu.com/ubuntu/pool/main/o/openssl/libssl1.1_1.1.1l-1ubuntu1.3_amd64.deb
sudo dpkg -i libssl1.1_1.1.1l-1ubuntu1.3_amd64.deb

When installing airflow, no files are created in the airflow_home folder

I have previously installed Airflow successfully on CentOS 7 in VMware.
However, installing it manually in the same way on CentOS 7 in Docker caused a problem.
(I used the official CentOS image.)
(venv) [jykim@0f0090962efa dev]$ cat /etc/*release*
CentOS Linux release 7.9.2009 (Core)
When airflow was installed with the command below, no files were created in the specified AIRFLOW_HOME directory.
pip3.8 install 'apache-airflow[postgres]'
Naturally, I set AIRFLOW_HOME in .bashrc and confirmed that it resolves correctly:
(venv) [jykim@0f0090962efa ~]$ pwd
/home/jykim
(venv) [jykim@0f0090962efa ~]$ cd $AIRFLOW_HOME/
(venv) [jykim@0f0090962efa airflow_home]$ pwd
/home/jykim/dev/airflow/airflow_home
Reinstalling Python gave the same result.
This cost me a whole day. I need your help!
(venv) [jykim@0f0090962efa airflow_home]$ python -V
Python 3.8.8
(venv) [jykim@0f0090962efa airflow_home]$ pip show apache-airflow
Name: apache-airflow
Version: 2.0.2
Installing the Airflow package will not create configuration files in the Airflow home directory. Run Airflow once for it to create the default configuration files, e.g. with:
airflow info
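To confirm that the first run actually created the configuration, you can check for airflow.cfg in the Airflow home directory. The sketch below assumes the usual behaviour that Airflow falls back to ~/airflow when AIRFLOW_HOME is unset; the helper names are hypothetical.

```python
import os
from pathlib import Path


def airflow_home(env=None):
    """Return the Airflow home directory from AIRFLOW_HOME, else ~/airflow."""
    env = os.environ if env is None else env
    return Path(env.get("AIRFLOW_HOME", "~/airflow")).expanduser()


def config_exists(home):
    """Return True if airflow.cfg exists directly inside home."""
    return (Path(home) / "airflow.cfg").is_file()


if __name__ == "__main__":
    home = airflow_home()
    print("AIRFLOW_HOME:", home)
    print("airflow.cfg present:", config_exists(home))
```

Run it in the same shell (and virtual environment) in which you ran airflow info, so that it sees the same AIRFLOW_HOME value.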

ERROR: flask 1.1.2 has requirement Werkzeug>=0.15

I am trying to install Airflow on Ubuntu 18.04 by running the following command:
pip install apache-airflow
And I get the following error:
OpenID Flask-JWT-Extended sqlalchemy-utils
ERROR: flask 1.1.2 has requirement Werkzeug>=0.15, but you'll have werkzeug 0.14.1 which is incompatible.
How can I resolve this error and install Airflow?
Judging by the error, you probably have a version conflict in one of Airflow's dependencies: Flask requires the Werkzeug module at version >=0.15, but you have version 0.14.1 installed.
To avoid the problem, you can install Airflow into a Python virtual environment, for example using pipenv.
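The comparison behind that error message can be illustrated with a few lines of Python. This is a simplified sketch, not pip's actual resolver: parse_version only understands plain dotted integers such as 0.14.1, and the helper names are hypothetical. Real version strings with suffixes such as rc1 need a proper parser, for example the packaging library.

```python
from importlib import metadata  # available in Python 3.8+


def parse_version(text):
    """Turn '0.14.1' into (0, 14, 1); handles plain dotted integers only."""
    return tuple(int(part) for part in text.split("."))


def satisfies_minimum(installed, minimum):
    """Return True if the installed version is >= the minimum version."""
    have, need = parse_version(installed), parse_version(minimum)
    width = max(len(have), len(need))
    # Pad with zeros so that '0.15' and '0.15.0' compare as equal.
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Version numbers taken from the pip error message above.
    print(satisfies_minimum("0.14.1", "0.15"))  # False
    print("installed Werkzeug:", installed_version("Werkzeug"))
```

Comparing the raw strings instead of integer tuples would give wrong answers once a component reaches two digits, which is why the versions are split and converted first.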
Install pipenv into your user's install directory:
pip3 install --user pipenv
export PATH=$PATH:~/.local/bin/
Install Airflow into a Python virtual environment:
mkdir my_airflow
cd my_airflow
pipenv install apache-airflow
Activate the virtual environment:
pipenv shell
Initialize Airflow's database (airflow initdb is the Airflow 1.x command; on Airflow 2.x use airflow db init):
airflow initdb
At this point you should have a working Airflow installation. Note that it is not a production-ready installation, but it is enough to continue with the tutorial.
