R segue createCluster() issue

I'm trying to create a cluster on EC2. I have an account setup and validated with AWS. I have successfully downloaded and installed the segue package and related packages and set my AWS credentials. My problem starts when I try to create a cluster and I get the following:
> library(segue)
Loading required package: rJava
Loading required package: caTools
Loading required package: bitops
Segue did not find your AWS credentials. Please run the setCredentials() function.
> setCredentials('', '') #keys hidden
> myCluster <- createCluster(numInstances=5)
Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, :
com.amazonaws.AmazonClientException: Can't turn bucket name into a URI: Illegal character in authority at index 8: https://c:\users\backup~1\appdata\local\temp\rtmp4u0n8yqaaoducils-segue.s3.amazonaws.com
Any ideas?

acesnap, I'm the author of Segue, and I can say with confidence that the issue you're running into is that the Segue package has not been implemented to run on the Windows platform. I suspect the cause is that Windows handles file paths, temp files, and the like differently: the error shows a Windows temp directory path ending up inside the S3 bucket name. The server side of the Segue package is always the Amazon Elastic MapReduce service, which runs Linux, but temporary files are built on the client machine, so Segue must play nicely with the local operating system.
There are several workarounds I can think of:
1. Set up VirtualBox on your local machine and get Ubuntu and R installed.
2. Set up an EC2 machine, install R and Segue, and then use that machine to fire off Segue jobs.
3. Buy a Mac or install Linux on a desktop machine (kinda obvious, I guess).
Even though my desktop machines are Mac and Linux, I use #2 above frequently. I do this because it speeds up the communication between the machine running Segue and the backend cluster. It also reduces the probability that the Segue main machine will lose connectivity to the EMR backend. This is valuable because if communication between Segue and the Amazon cloud is lost while a job is running, the job will keep running on the cloud cluster but have no way of returning results to the Segue main machine (the machine you submit jobs from).

Related

Install Openstack on single node

I want to install OpenStack on CentOS 8 on a single node. I have one physical machine on which I want to install all OpenStack nodes. I need this setup for simulation only, not for production use.
I have tried to install OpenStack with Packstack three times, without success.
I ran into a different issue on each attempt:
1. On the first attempt, after installation, I created an instance successfully but could not get its console.
2. On the second attempt, no network was allocated during instance deployment.
3. On the third attempt, Packstack got stuck at the Puppet testing stage.
I have followed below 2 links:
https://computingforgeeks.com/install-openstack-victoria-on-centos/
https://www.google.com/amp/s/www.linuxtechi.com/install-openstack-centos-8-with-packstack/amp/
I followed every step in the links.
I want to create two Ubuntu VMs on OpenStack.
Can someone point me to links or videos that cover everything required to install OpenStack on a single node, create two Ubuntu VMs, assign a network to them, and test connectivity between the two VMs?
Thanks in advance.
I would use the official Packstack documentation. Note that you should start with a totally fresh CentOS installation; i.e. don't try to install Packstack on a server where a previous installation failed (or succeeded).
You can also try Devstack. Its default configuration requires a smaller machine than Packstack (in my experience, 8GB RAM should be sufficient). Same remark: start with a fresh installation of CentOS or Ubuntu.
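As a rough sketch of the Devstack route, assuming a fresh machine, a non-root user with sudo rights, and placeholder passwords: the helper names write_local_conf and install_devstack are hypothetical, while the local.conf keys are the ones the Devstack quick-start guide uses.

```shell
#!/usr/bin/env bash
# Minimal single-node Devstack bootstrap (sketch; passwords are placeholders).

# Write the smallest local.conf Devstack accepts into the given directory.
write_local_conf() {
    cat > "$1/local.conf" <<'EOF'
[[local|localrc]]
ADMIN_PASSWORD=secret
DATABASE_PASSWORD=$ADMIN_PASSWORD
RABBIT_PASSWORD=$ADMIN_PASSWORD
SERVICE_PASSWORD=$ADMIN_PASSWORD
EOF
}

# Clone Devstack, configure it, and run the installer.
install_devstack() {
    git clone https://opendev.org/openstack/devstack &&
    write_local_conf devstack &&
    (cd devstack && ./stack.sh)
}

# On the fresh machine, run:  install_devstack
```

The installer takes a while and changes the system substantially, which is another reason to run it only on a machine you can wipe afterwards.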
Microstack is another alternative. Its advantage is a very simple and quick installation; its disadvantages are a very strange (in my opinion) configuration and sparse documentation. However, it is suitable for your purpose. It claims to work on any Linux, Windows and macOS; it does require snap.
I suggest installing directly onto Ubuntu Server.
Some time ago I wrote a series of posts in which I explained in detail how to install OpenStack Rocky. The first two blog posts ([1] and [2]) contain commands, examples, and configuration file contents that cover common scenarios, plus tips for the successful installation of most OpenStack services (keystone, nova, glance, etc.) on a single node. The third post [3] describes the installation of a compute node. That third post uses a separate node to make it easier to understand how nova works, but the installation can safely be carried out on the same node as the other components.
I find the posts short and very easy to follow (I use that blog as my own installation notes and have used them for several deployments). The only caveat is that they are based on Ubuntu, but if you know your system, it should be easy to translate the installation to CentOS (some colleagues have used these tips for CentOS installations).
I tried to install OpenStack several times last week (October 2021):
a) With Devstack on CentOS 8 Stream on bare metal (a real server): no version installed successfully (neither master nor Xena or Wallaby; Victoria and earlier do not support Stream).
b) With Packstack on a virtual machine running CentOS 8 Stream: the installation clearly succeeded (!) and was quite easy (following the official RDO project homepage). However, there was a real problem with networking: no external network was accessible. The router was created correctly with an external connection (the router IP was reachable from outside), but no connection was possible to or from the instance.
So I conclude that OpenStack is not documented well enough to resolve such problems, although its installation can be quite easy (when it finishes successfully ;) ).
Addition: Of course, there are resources that explain how the network can be configured, and the official OpenStack docs describe different network configurations as well (although they are hard to find quickly as a newbie). In any case, this system takes a lot of time to study before you can use it.

Running R code via Rstudio on a remote server not by browser

Is there a way to use a local RStudio installation on my machine while the code actually runs on a remote server, where I can run distributed jobs via SLURM?
Can it be compatible with version control and Docker?
The remoter package does what you want very well. You start R on the remote server and run remoter::server(showmsg = TRUE). Then in your local RStudio you run remoter::client(). It works fairly flawlessly.
My main complaint is that when you run help, the output appears in the console (from the remote session) rather than in the RStudio help window.
https://cran.r-project.org/web/packages/remoter/vignettes/remoter.pdf

Redshift JDBC connection crashes on second opening in R

I am using the RJDBC package to connect to AWS Redshift from an EC2 Ubuntu instance.
I can successfully connect using the JDBC() call, retrieve/insert rows and then close the connection.
However, when I open a second connection in the same R session, R crashes with a segmentation fault. This happens in both RStudio and console R. I'm using conda to manage the R installation.
I have tried the connection using the native Redshift jar provided by Amazon and also another jar from Progress Software. I get the same effect with both drivers: the first connection is fine, subsequent connections crash.
I've installed the latest JVM v8. I had seen some other threads that suggested installing v6 as a workaround, but unfortunately that is no longer available on the Oracle site.
My gut feeling is that Java has a weird interaction with R, but I'm at a loss as to how to proceed.
OK, I solved this myself and thought I'd record in case this is useful to others.
The problem was really with rJava not re-initialising the JVM correctly.
I added the following line before opening a database connection:
rJava::.jinit(force.init = TRUE)
Now I can open and close connections without issue using RJDBC

Open Stack Volume won't attach

I am using OpenStack to create a CentOS 7 VM.
I can get the VM to run, but the installer hits a snag on the first page.
It needs a disk to install to (Installation Destination).
I thought this would be the volume that I attached using the OpenStack app. I used the volume's Edit Attachments dialog, and it pops up saying it will attach the volume, but the volume is never listed as attached to ANY instance I attach it to.
It also needs an Installation Source, for which I used the URL of the mirror site I downloaded from. Here is the URL:
ISO URL
I used the net install ISO. For the installation source I tried the same URL, and I also tried the URL with isos changed to os, like this:
OS URL
Thanks for any help.
When you create VMs in OpenStack, you are not supposed to go through the installation process. In the cloud you use cloud images that are ready to boot.
You should use a CentOS cloud image.
Try loading this CentOS 7 image into your OpenStack Glance:
http://ubuntu.mirror.cloud.switch.ch/engines/images/2016-04-15/centos7.raw
You should then be able to boot your VM and log in with the username centos and the public key you provide through cloud-init.
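A sketch of what that could look like with the openstack command-line client, assuming the client is installed and your credentials are loaded. The image, key pair, flavor, network, and server names are placeholders, and the function names upload_image and boot_server are hypothetical.

```shell
#!/usr/bin/env bash
# Register a ready-made cloud image in Glance and boot a VM from it (sketch).

# Image URL taken from the answer above; it may have moved since.
IMAGE_URL=http://ubuntu.mirror.cloud.switch.ch/engines/images/2016-04-15/centos7.raw

# Download the raw image and upload it to Glance under the given name.
upload_image() {
    curl -L -o centos7.raw "$IMAGE_URL" &&
    openstack image create --disk-format raw --container-format bare \
        --file centos7.raw "$1"
}

# Boot a server from the image; cloud-init installs the key for user "centos".
boot_server() {
    openstack server create --image "$1" --flavor m1.small \
        --key-name mykey --network private "$2"
}

# Example (placeholder names):
#   openstack keypair create --public-key ~/.ssh/id_rsa.pub mykey
#   upload_image centos7 && boot_server centos7 centos7-vm
```

With this approach there is no installer and no Installation Destination to choose: the image itself is the root disk, and an extra volume is only needed for additional storage.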

RStudio Server on ec2 - Not persistent when closing browser tab

I am running RStudio server on an ec2 instance (using Louis Aslett's AMI) and connect through the browser.
I have some long scripts to run and thought I would be able to leave them running and close the browser tab/turn off my computer.
However, when I do this it seems to interrupt the console and when I log back into the server (pasting address into address bar and logging back in) I am met with an alert telling me that the R session terminated and my workspace is completely reset (working directory reset, and any data or variables lost).
Note that I am not terminating the instance, I am simply closing the browser tab that RStudio is loaded in.
Am I doing something wrong? Is there a proper way to disconnect safely and prevent this from happening?
Thanks
The author of the AMI implies that the AMI is based on Linux, so you can run screen before launching your RStudio server session.
The screen package is bundled with most Linux distributions. The author doesn't mention which distro his AMI is based on or list all of the included packages, but if the AMI doesn't have it, you can use a package manager to install it:
sudo apt-get install screen -y
if your package manager is apt. The installation using the yum package manager is similar.
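One way to apply this, sketched under the assumption that you start the long script from an SSH shell on the instance rather than from the RStudio console in the browser. The session name rjob and the script long_script.R are placeholders, and run_detached is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Run a long R script inside a detached screen session so that it survives
# closing the terminal or browser (sketch; names are placeholders).

# Start a detached, named screen session running the given command.
run_detached() {
    local name="$1"
    shift
    screen -dmS "$name" "$@"
}

# Example:
#   run_detached rjob Rscript long_script.R   # start the job
#   screen -ls                                # list running sessions
#   screen -r rjob                            # reattach; Ctrl-A then D detaches
```

Because the script runs in its own R process, it should save its results to disk (for example with saveRDS) so you can load them from RStudio afterwards.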
