How to Choose R Server's R as Default in Operationalization, Remote R Workspace and RStudio Server? - r

So I've set up an Azure Data Science Virtual Machine on Linux (Ubuntu) and I've executed the following on the terminal to enable Remote R workspace, RStudio Server, R Server Operationalization and hadoop:
sudo apt update
sudo apt -y upgrade
# Hadoop is installed but doesn't seem to appear on the PATH or have its environment variable set by default
sudo echo "" >> ~/.bashrc
sudo echo "export PATH="'$'"PATH:/opt/hadoop/hadoop-2.7.4/bin" >> ~/.bashrc
sudo echo "export HADOOP_HOME=/opt/hadoop/hadoop-2.7.4" >> ~/.bashrc
#
source ~/.bashrc
#Setting up a password as none exists to begin with because of private key selection in the installation
#RStudio Server requires a password though
"MyPassword\nMyPassword\n" | sudo passwd sshuser
#Unfortunately hadoop fails on Data Science Virtual Machine
#error: mkdir: Call From IM-DSonUbuntu/192.168.5.4 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
# hadoop fs -mkdir /user/RevoShare/rserve2
# hadoop fs -chmod uog+rwx /user/RevoShare/rserve2
sudo mkdir -p /var/RevoShare/rserve2
sudo chmod uog+rwx /var/RevoShare/rserve2
# hadoop fs -mkdir /user/RevoShare/sshuser
# hadoop fs -chmod uog+rwx /user/RevoShare/sshuser
sudo mkdir -p /var/RevoShare/sshuser
sudo chmod uog+rwx /var/RevoShare/sshuser
#Setting up R Server Operationalisation
cd /opt/microsoft/mlserver/9.2.1/o16n
sudo dotnet Microsoft.MLServer.Utils.AdminUtil/Microsoft.MLServer.Utils.AdminUtil.dll -silentoneboxinstall MyPassword
#They say this Data Science Virtual Machine already has RStudio Server, but even though the port 8787 is open, it's nowhere to be found! So installing it now, and after the installation it's accessible by refreshing the page that failed before.
#Perhaps it's not installed then? Or a service is not running like it shoudl?
#https://www.rstudio.com/products/rstudio/download-server/
wget https://download2.rstudio.org/rstudio-server-1.1.414-amd64.deb
yes | sudo gdebi rstudio-server-1.1.414-amd64.deb
#They are small, leave them for debug reasons - lets have evidence the script run thus far.
#sudo rm rstudio-server-1.1.414-amd64.deb
# Remote R workspace Service needs dotnet sdk
curl https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor > microsoft.gpg
sudo mv microsoft.gpg /etc/apt/trusted.gpg.d/microsoft.gpg
sudo sh -c 'echo "deb [arch=amd64] https://packages.microsoft.com/repos/microsoft-ubuntu-xenial-prod xenial main" > /etc/apt/sources.list.d/dotnetdev.list'
sudo apt update
sudo apt -y install dotnet-sdk-2.0.0
sudo apt install libxml2-dev
#Downloading and installing the Remote R service
wget -O rtvs-daemon.tar.gz https://aka.ms/r-remote-services-linux-binary-current
tar -xvzf rtvs-daemon.tar.gz
sudo ./rtvs-install -s
sudo systemctl enable rtvsd
sudo systemctl start rtvsd
#sudo rm rtvs-daemon.tar.gz
#sudo rm rtvs-install
#Fixing Remote R: For some reason, even though 'sudo systemctl enable rtvsd' runs, after every reboot the service won't become automatically active. So let's fix that.
wget https://sa0im0general.blob.core.windows.net/general-blob-container/StartRemoteRAfterReboot.sh
sudo mv StartRemoteRAfterReboot.sh /var/RevoShare/StartRemoteRAfterReboot.sh
sudo /sbin/shutdown -r 5
sudo chown root /etc/rc.local
sudo chmod 755 /etc/rc.local
sudo systemctl enable rc-local.service
sudo -s
sudo find /etc/ -name "rc.local" -exec sed -i 's/exit 0//g' {} \;
sudo echo "" >> /etc/rc.local
sudo echo "sh /var/RevoShare/StartRemoteRAfterReboot.sh" >> /etc/rc.local
sudo echo "exit 0" >> /etc/rc.local
exit
I've also tried, one by one, these, to see if it makes any difference to the RStudio Server (it didn't, but even if it did, I want a global solution to work on Remote R Workspace Service and R Server Operationalisation as well, not only RStudio Server):
#Configuring RStudio Server to see the R Server R
sudo echo "rsession-which-r=/opt/microsoft/mlserver/9.2.1/bin/R/R" >> /etc/rstudio/rserver.conf
export RSTUDIO_WHICH_R=/opt/microsoft/mlserver/9.2.1/bin/R/R
sudo echo "RSTUDIO_WHICH_R=/opt/microsoft/mlserver/9.2.1/bin/R/R" >> ~/.profile
source ~/.profile
sudo echo "RSTUDIO_WHICH_R=/opt/microsoft/mlserver/9.2.1/bin/R/R" >> ~/.bashrc
source ~/.bashrc
sudo echo "PATH=$PATH:/opt/microsoft/mlserver/9.2.1/bin/R" >> ~/.bashrc
export PATH=$PATH:/opt/microsoft/mlserver/9.2.1/bin/R
source ~/.bashrc
The problem is that even though "which R" points to R Server's R, i.e. typing "sudo R" will show the message "Loading Microsoft R Server packages, version 9.2.1." and will load packages like RevoScaleR, everything else fails to do so.
Accessing the RStudio Server with http://THE-IP-GOES-HERE.westeurope.cloudapp.azure.com:8787 and logging in with the initial user ("sshuser") (or with any other user for that matter) will NOT load R Server and RevoScaleR rx functions are unavailable
Using my local Visual Studio 2017 to access the remote workspace via "Add connection" on "Workspaces" tab loads MRO and says:
Installed R versions:
[0] Microsoft R Open '3.4.1.1347' (Default)
And finally, when I use R Server's Operationalisation and log in with "mrsdeploy" package's "remoteLogin()" R Server packages like RevoScaleR are not loaded again, so things like "rxSummary(~., data=iris)" fail with error 'could not find function "rxSummary"'
The exact same thing happened when I deployed from azure a "Machine Learning Server 9.2.1 on Linux (Ubuntu)".
I don't want to just use the regular open source R, I want to be able to use the R Server - that's why I deployed this VM. How can I make it so that everything loads R Server's R, not Microsoft R Open? (Like I'm able to do from terminal using "R")
As a result of my having tried all of this and the fact that R Server is loaded in the console, my mind now goes to permissions. Could it be that by default the Data Science VM doesn't have the correct permissions to allow these?
I'm at a loss

RStudio Server is installed on the Ubuntu DSVM, but the service is disabled by default as it does not support SSL. You can enable it with systemctl enable rstudio-server, then start it with systemctl start rstudio-server.
RStudio Server uses the same R as Microsoft R Server, but the .libPaths are different, which is why you cannot load the MRS packages. You will need to manually set the .libPaths so they match.

Related

Making HTTPS requests within a Docker image behind a Zscaler firewall

I'm interested in running a simple image like this behind a corporate Zscaler firewall:
FROM rocker/r-base
RUN apt-get update && apt-get install libssl-dev
CMD Rscript -e "install.packages('beepr')"
Building the image with docker build -t test . fails with errors like this:
Certificate verification failed: The certificate is NOT trusted. The certificate issuer is unknown. Could not handshake: Error in the certificate verification. [IP: ]
I've tried some of the solutions from here but they don't work. For example:
FROM rocker/r-base
# Add local certificate to Docker
ADD ./zscaler.cer /usr/local/share/ca-certificates/zscaler.crt
# Move the certificate to the cert dir of openssl and update certificates
RUN CERT_DIR=$(openssl version -d | cut -f2 -d \")/certs ; cp /usr/local/share/ca-certificates/zscaler.crt $CERT_DIR ; update-ca-certificates
# Try making https requests
RUN apt-get update && apt-get install libssl-dev
CMD Rscript -e "install.packages('beepr')"
Same errors persist with docker build -t test .. I've read some possible solutions online but all of them continually fail either for apt-get or for installing packages with R. Is there anyone who has experienced this and found a fix?
Apparently, the current advice is slightly wrong. The certificate should not go in /etc/ssl/certs/ (which is the result of CERT_DIR=$(openssl version -d | cut -f2 -d \")/certs) but rather on CERT_DIR=/usr/local/share/ca-certificates/ (at least on this Ubuntu image). After changing that, update-ca-certificates correctly updates the certificate an all HTTPS requests are successful.
This should work now:
FROM rocker/r-base
# Add local certificate to Docker
ADD ./zscaler.pem /usr/local/share/ca-certificates/ZscalerRootCertificate-2048-SHA256.crt
# update certificates
RUN update-ca-certificates
# Try making https requests
RUN apt-get update && apt-get install libssl-dev
CMD Rscript -e "install.packages('beepr')"

Install systemd service on Debian installation

I'm building custom Debian ISO with simple-cdd utility. It worked well till the moment when I attached my own .deb package.
build-simple-cdd --dist stretch --profiles moj --force-root --local-packages /root/iso/deb
build-simple-cdd works properly, because I saw my deb package in tmp directory structure and iso image is created successfully. However debian installation fails
I suspect, that postinst script fails, since it uses systemctl command when it may be unavailable.
#!/bin/sh
set -e
echo $1
if [ "$1" = "configure" ]; then
echo "Configuring privileges..."
chown user:user /usr/bin/Koncentrator
chmod 0755 /usr/bin/Koncentrator
echo "Enabling Koncentrator services..."
systemctl daemon-reload
systemctl enable Xvfb.service
systemctl enable Koncentrator.service
fi
I've added systemd dependency to control file, but it doesn't work.
I made workaround for this issue. simple-cdd allows to prepare post installation script. apt install is called there without problems. Two steps are required to use this solution:
Add deb package to installation disk. This is configured via profile configuration file (moj.conf):
all_extras="$all_extras /root/iso/files/customapackage_0.1.3.deb"
Run apt install in moj.postinst script:
#!/bin/sh
mount /dev/cdrom /media/cdrom
cd /media/cdrom/simple-cdd
apt install ./custompackage_0.1.3.deb
cd /
sync
umount /media/cdrom
If you want to debug your postinst script, you can insert there long sleep:
#!/bin/sh
sleep 10000000
...
And switch terminal (Ctrl+Alt+F1-6) during finish-install phase. Than call chroot /target to switch in-target environemnent

How to mount Cloud Filestore in GCP AI platform Jupyter notebook?

I want to mount a Cloud Filestore instance in a GCP AI Platform Jupyter notebook instance so that I don't have to upload all of my data into the notebook.
I followed the instructions at https://cloud.google.com/filestore/docs/mounting-fileshares, but get these error messages:
root#0084329abd1b:/home# mount <IP_ADDRESS>:/streams cfs
mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
root#0084329abd1b:/home# mount -o nolock <IP_ADDRESS>:/streams cfs
mount.nfs: Operation not permitted
From your terminal, you can do something like this.
mkdir des_bucket
gcsfuse --debug_gcs --implicit-dirs src_bucket des_bucket
Create a Filestore instance link
Crerate a Google VM instance link
Create a Notebook AI instance link
On the VM instance run the commands:
sudo apt-get -y update
sudo apt-get -y install nfs-common
sudo mkdir test
# fileshare remote target
sudo mount 111.11.111.11:/fileshare test
sudo chmod go+rw test
echo 'This is a test' > test/testfile
ls test
#testfile
On the Notebook AI instance run the commands link:
sudo apt-get -y update
sudo apt-get -y install nfs-common
sudo mkdir test
# fileshare remote target
sudo mount 111.11.111.11:/fileshare /test
ls test
#testfile
You can also check link

Error installing Meteor on linux x86_64 chrome os

I am trying to install Meteor on the HP14 Chromebook. It is a linx x86_64 chrome os system.
Each time I try to install it I run into errors.
The first time I tried to install it the installer just downloaded the Meteor preengine but never downloaded the tarball or installed the actual meteor application structure.
So, I decided to try as sudo.
sudo curl https://install.meteor.com | /bin/sh
This definitely installed it because you can see it when ls
chronos#localhost ~/projects $ chronos#localhost ~/projects $ ls /home/chronos/user/.meteor/
bash: chronos#localhost: command not found
Now when I try to run meteor --version or meteor create myapp without sudo I get the following error.
````
chronos#localhost ~/projects $ meteor create myapp
'/home/chronos/user/.meteor' exists, but '/home/chronos/user/.meteor/meteor' is not executable.
Remove it and try again.
````
When I try to run sudo meteor --version or sudo meteor create myapp I get this error.
chronos#localhost ~/projects $ sudo meteor create myapp
mkdir: cannot create directory ‘/root/.meteor-install-tmp’: Read-only file system
Any ideas? Thinking I have to make that partition writeable. I made partition 4 writeable.
Put your chrome book into dev mode.
http://www.chromium.org/chromium-os/developer-information-for-chrome-os-devices
Boot into dev mode.
ctrl-alt t to crosh
shell
sudo su -
cd /usr/share/vboot/bin/
./make_dev_ssd.sh --remove_rootfs_verification --partitions 4
reboot
After rebooting
sudo su -
mount -o remount,rw /
mount -o remount,exec /mnt/stateful_partition
Write yourself a read/write script
sudo vim /sbin/rw
#!/bin/bash
echo "Making FS Read/Write"
sudo mount -o remount,rw /
sudo mount -o remount,exec /mnt/stateful_partition
sudo mount -i -o remount,exec /home/chronos/user
echo "You should now have full Read/Write access"
exit
Change permissions on script
sudo chmod a+x /sbin/rw
Run to set read/write root
sudo rw
Install Meteor as indicated on www.meteor.com via curl and meteor create works!
Alternatively you can edit the chomeos_startup though that might not be the best idea. It is probably best to have read/write on demand as illustrated above.
cd /sbin sudo
sudo vim chromeos_startup
Go to lines 51 and 58 and remove the noexec options from the mount command.
Down at the bottom of the script, above the note about ureadahead and below the if statement, add in:
mount -o remount,exec /mnt/stateful_partition
#uncomment this to mount root r/w on boot
mount -o remount,rw /
Again, editing chromeos_startup probably isn't the best idea unless you are so lazy you can't type sudo rw.
Enjoy.
This is super easy to fix!!
Just run this (or put it in .bashrc or .zshrc to make it permanent):
sudo mount -i -o remount,exec /home/chronos/user
Based on your question (you are using sudo) I assume you already have Dev Mode enabled, which is required for the above sudo command to work.
ChromeOS mounts the home folder using the noexec option by default, and this command remounts it with exec instead. And boom, Meteor will work just fine after that (and so will a bunch of other programs running out of your home folder).
Original tip: https://github.com/dnschneid/crouton/issues/928

EC2 startup script gets stuck on wget

I have the following script for some number crunching
#!/bin/bash
sudo apt-get update -y
sudo apt-get upgrade -y
sudo apt-get install -y r-base r-base-dev htop s3cmd p7zip-full
wget https://s3.amazonaws.com/#######/###.7z
7z e ###.7z
sudo R CMD BATCH --slave --no-timing --vanilla "--args 0 1 100 200 500 2" SOME-ROUTINE.R
s3cmd put *.results s3://#########/
on EC2. I upload the script as file at the Launch Instance->Instance Details->User Data
The machine fires up, updates and upgrades but then it does not execute wget and does not download the file. When i SSH in the Instance and run the exact same commands the process completes without problems.
Any ideas why wget does not work?
Any other alternatives?
EC
It is always a bit of guessing, but here is how I would debug this:
My first suggestion would be to check for special characters in the S3 URL. This might cause the wget call to fail.
Second, I would give an explicit output path to wget with the -O option. While you are editing the command, you can also add -o to output logging information.
Last step is to check your access rights to the S3 bucket. Perhaps you can try to put the file on another webspace to see if the command executes then.

Resources