I am currently developing a JupyterHub deployment inside a Kubernetes cluster, using "KubeSpawner" with autoscaling on a cloud platform. Spawning a notebook server takes a while because of the autoscaler, which only scales nodes up when more notebook servers are required.
Could anyone give me advice on achieving spawn speeds similar to Kaggle's? Currently it takes around 3-5 minutes just to spawn a notebook server, while Kaggle can spawn one in 30 seconds at most, which is very impressive compared to mine.
Thanks in advance.
Related
A few days ago I noticed R was using 34% of the CPU when I had no code running. I noticed it again today and I can't figure out why. If I restart R, CPU usage returns to normal, then after 20 minutes or so it ramps up again.
I have a task scheduled that downloads a small file once a week using R, and another using wget in Ubuntu (WSL). It might be that the constant CPU usage only happens after I download COVID-related data from a GitHub repository (link below). Is there a way to see if this is hijacking resources? If it is, other people should know about it.
I don't think it's a Windows task-reporting error, since my temperatures are what I would expect for a constant 34% CPU usage (~56°C).
Is this a security issue? Is there a way to see what R is doing? I'm sure there is a way to better inspect this, but I don't know where to begin. GlassWire hasn't reported any unusual activity.
From the Windows 10 Event Viewer, I've noticed a lot of these recently but don't quite know how to read them:
The application-specific permission settings do not grant Local Activation permission for the COM Server application with CLSID {8BC3F05E-D86B-11D0-A075-00C04FB68820} and APPID {8BC3F05E-D86B-11D0-A075-00C04FB68820} to the user redacted SID (S-1-5-21-1564340199-2159526144-420669435-1001) from address LocalHost (Using LRPC) running in the application container Unavailable SID (S-1-15-2-181400768-2433568983-420332673-1010565321-2203959890-2191200666-700592917). This security permission can be modified using the Component Services administrative tool.
Edit: CPU usage seems to be positively correlated with how long R has been open.
Given the information you provided, it looks like RStudio (not R) is using a lot of resources. R and RStudio are two very different things. These types of issues are very difficult to investigate, as one needs to be able to reproduce them on another computer. One thing you could do is raise the issue with the RStudio team on GitHub.
I use a Jupyter Notebook instance on SageMaker to run code that takes around 3 hours to complete. Since I pay by the hour, I would like to automatically "Close and Halt" the notebook as well as stop the notebook instance after running that code. Is this possible?
Jupyter Notebooks are predominantly designed for exploration and development. If you want to launch long-running or scheduled jobs on ephemeral hardware, it will be a much better experience to use the training API, such as create_training_job in boto3 or estimator.fit() in the Python SDK. The code passed to training jobs can be completely arbitrary - not necessarily ML code - so whatever you write in Jupyter could likely be scheduled and run in those training jobs. See the random forest sklearn demo here for an example. That being said, if you still want to programmatically shut down a SageMaker notebook instance, you can use this boto3 call:
import boto3

sm = boto3.client('sagemaker')
# stop the (billed) notebook instance; 'string' is a placeholder for your instance name
sm.stop_notebook_instance(NotebookInstanceName='string')
My R Shiny application loads quickly when run locally but takes an inordinate amount of time to load when run through ShinyProxy. I have tried tweaking the heartbeat and load-wait times, and Spring Boot, although slow, doesn't seem to account for all of the delay. I'm running CentOS 7 with the latest ShinyProxy RPM install. Docker has been configured correctly (the containers do start eventually).
The Shiny logs don't give much detail other than that Docker is starting and the proxy is enabled.
Has anyone come across an issue similar to this before? Is this normal?
I think it is normal, because ShinyProxy needs to start a Docker container for every user that logs on to your app.
Building a Meteor project takes a long time when performed inside a Docker image (> 30 min), but when done on a local machine or directly on the server without Docker, everything works fine (2-10 min).
After some research I still do not know how to tackle this issue. Can anyone provide me with further instructions on how to debug my Docker setup? The performance seems really bad.
What should be the minimum server hardware to get a quick build time? The project itself is not large.
I've got a Windows HPC Server running with some nodes in the backend. I would like to run parallel R using multiple nodes from the backend. I think parallel R might use SNOW on Windows, but I'm not too sure about it. My question is: do I also need to install R on the backend nodes?
Say I want to use two nodes, 32 cores per node:
cl <- makeCluster(c(rep("COMP01",32),rep("COMP02",32)),type="SOCK")
Right now, it just hangs.
What else do I need to do? Do the backend nodes need some kind of sshd running to be able to communicate with each other?
Setting up snow on a Windows cluster is rather difficult. Each of the machines needs to have R and snow installed, but that's the easy part. To start a SOCK cluster, you would need an sshd daemon running on each of the worker machines, and you can still run into trouble, so I wouldn't recommend it unless you're good at debugging and Windows system administration.
I think your best option on a Windows cluster is to use MPI. I don't have any experience with MPI on Windows myself, but I've heard of people having success with the MPICH and DeinoMPI MPI distributions for Windows. Once MPI is installed on your cluster, you also need to install the Rmpi package from source on each of your worker machines. You would then create the cluster object using the makeMPIcluster function. It's a lot of work, but I think it's more likely to eventually work than trying to use a SOCK cluster due to the problems with ssh/sshd on Windows.
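As a rough illustration, here is a minimal sketch of what the MPI route could look like once MPI, Rmpi and snow are installed on every node (the worker count of 64 matches the two 32-core nodes from the question; treat this as an outline rather than a tested recipe):
library(Rmpi)
library(snow)
# spawn 64 MPI worker processes; the MPI runtime decides which nodes they land on
cl <- makeMPIcluster(64)
# quick check of where each worker is actually running
clusterCall(cl, function() Sys.info()[["nodename"]])
stopCluster(cl)
mpi.quit()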
If you're desperate to run a parallel job once or twice on a Windows cluster, you could try using manual mode. It allows you to create a SOCK cluster without ssh:
workers <- c(rep("COMP01",32), rep("COMP02",32))
cl <- makeSOCKcluster(workers, manual=TRUE)
The makeSOCKcluster function will prompt you to start each one of the workers, displaying the command to use for each. You have to manually open a command window on the specified machine and execute the specified command. It can be extremely tedious, particularly with many workers, but at least it's not complicated or tricky. It can also be very useful for debugging in combination with the outfile='' option.
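For instance, a small sketch of that debugging combination (using the same hypothetical host names as above) might be:
workers <- c(rep("COMP01",32), rep("COMP02",32))
# outfile="" sends each worker's output to its own window, which makes startup problems visible
cl <- makeSOCKcluster(workers, manual=TRUE, outfile="")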