R Studio incredibly slow only after I connect to my work VPN? - r

I have R Studio on Windows. It works fine before I connect to the internet through my VPN. After I connect, commands start to hang, autocomplete can take forever, and simple operations like 4 + 4 can take one minute or more.
I have a feeling RStudio is making connections under the hood. I would like to disable all of these connections no matter what.

I have experienced the same thing - I am assuming your actual work is on a file server and not local. I found the culprit in my case to not be RStudio directly, but rather the RStudio project file. RStudio will create a generally small, hidden folder in the directory named .Rproj.user with some settings. This, living on the file server, caused constant read/writes through my VPN connection.
I unfortunately had to either (a) move projects off the file server into a local copy (not bad since I can use a company GitLab), or (b) delete the .Rproj file and the .Rproj.user folder from the project directory on the server and use here::here() or something like that as a workaround in my workflow.
As another possibility, I have seen installations of R itself done onto a personal server drive instead of locally. This has been done to avoid needing administrator privileges to install. This is not a great idea and can also result in extremely slow performance through a VPN connection. You can check to see where R is installed as well. Sounds like this is not the problem though based on what you described.
Maybe it is something else, but this is what I found last week for me based on a very similar experience.
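To see whether this applies to you, a small script can list the .Rproj.user folders under a directory and flag paths that look like network shares. This is only a sketch: the function names are made up for illustration, and the UNC check is a rough heuristic that will not recognise a mapped drive letter such as Z:.

```python
from pathlib import Path


def find_rproj_user_dirs(root):
    """Return every .Rproj.user directory found at or below root, sorted."""
    root = Path(root)
    # pathlib's rglob does not skip dot-directories, so hidden folders are found.
    return sorted(p for p in root.rglob(".Rproj.user") if p.is_dir())


def looks_like_unc_path(path):
    """Rough check: True when the path starts with two backslashes or two slashes."""
    text = str(path)
    return text.startswith("\\\\") or text.startswith("//")
```

Calling find_rproj_user_dirs on the directory that holds your projects and passing each result to looks_like_unc_path gives a quick list of candidates to move to a local disk.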

Related

R and Rstudio Docker vs Binder

My problem is that I can't use RStudio at my workplace because IT does not support it. I want to use the R and RStudio installed on my personal laptop from my company laptop (using a modern browser, which is behind a firewall). I am considering two options:
Should I build a Docker image for R and RStudio (I see base images are already available)? I am mostly interested in base R and the dplyr (haven, xporter, and reticulate) packages.
Should I use Binder? I am not a technical person and my programming skills are very limited. Can anyone suggest a way?
What exactly is the difference between the Docker option and Binder?
I know I can use RStudio online and get my work done, but with the new paid account I am running out of project hours, and it is sometimes very slow. Thanks in advance.
Here are some examples beyond the modern RStudio MyBinder example:
https://github.com/fomightez/pythonista_skewedf
https://github.com/fomightez/r_phylogenetics_worshop
https://github.com/fomightez/chapter7/tree/master/binder
The modern RStudio MyBinder example has been set as a template on GitHub so you can use
The first one is for a special use of a package not on conda. And I started that one from square one.
The other two were converted from content by others to aid in making them Binder-ready.
You essentially list everything you need from conda in the environment.yml along with the appropriate channels. If you need special stuff not on conda, you need the other configuration files included there.
Getting everything working can take some iterations of adding things, letting the image get built, and testing that your libraries are available. You seem to think your situation is not overly complex, though.
The Binder launch badges you see are just images whose link URL you modify to point the MyBinder federation site at your repository. Look at the URL and you should see the pattern, with rstudio at the end of the URL pointing at your repo. The form on the MyBinder.org site can help with this; however, it is most often easier to just adapt a working launch badge's code copied from elsewhere. The form isn't set up at this time for making the URLs for launching to RStudio.
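As a rough illustration of that URL pattern, the sketch below builds a MyBinder launch link that opens RStudio for a GitHub repository, plus the Markdown for a badge. The function names and the example user and repository are placeholders, and only the GitHub form of the URL is covered.

```python
from urllib.parse import quote


def binder_rstudio_url(user, repo, ref="HEAD"):
    """Build a mybinder.org launch URL for a GitHub repo that opens RStudio."""
    # A branch name containing "/" has to be percent-encoded, hence safe="".
    return (
        "https://mybinder.org/v2/gh/"
        f"{quote(user)}/{quote(repo)}/{quote(ref, safe='')}"
        "?urlpath=rstudio"
    )


def binder_badge_markdown(user, repo, ref="HEAD"):
    """Return Markdown for a launch badge that links to the URL above."""
    url = binder_rstudio_url(user, repo, ref)
    return f"[![Binder](https://mybinder.org/badge_logo.svg)]({url})"


if __name__ == "__main__":
    # "someuser" and "somerepo" are placeholders, not a real repository.
    print(binder_badge_markdown("someuser", "somerepo", "main"))
```

Pasting the printed line into a README gives a clickable badge; the part that sends the session to RStudio is the urlpath=rstudio query at the end.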
Download anything useful you create in a running session. The sessions time out after 10 minutes of inactivity, although RStudio usually keeps them active.
Lack of persistence and limited memory, storage, and power can be drawbacks. The inherent reproducibility and portability are advantages.
MyBinder.org doesn't work with private repos. If you have code you don't want to share, you can upload it to the temporary session, using the repo only for specifying the environment. You could host a private BinderHub that does allow the use of private git repositories; however, that is probably overkill for your use case and exceeds your ability level at this time.
GitHub isn't the only place to host repositories that can be pointed at the MyBinder system. If you go to the MyBinder.org page and click where it says 'GitHub' on the left side of the top line of the form, you can see a list of the sources at which you can host a repository and point the system to build an image and launch a container with that specified image.
Building the image from a source repository takes some minutes the first time. Once the image is built on the service, though, launch typically takes less than 30 seconds. Each time you make a change on the source repo, a build is necessary. Some changes don't make the new build as long as the initial one, because some optimizing is done to only build what is necessary after a change. Keep in mind there are several members of the federation around the world, and if traffic on the internet gets sent to one where the built image isn't yet available, it will be built from scratch again first.
The Holepunch project is out there to offer some help for users working in the R ecosystem; however, with the R-Conda system that is now integrated into MyBinder it is pretty much as easy to do it the way I described. Last I knew, the Holepunch route makes a Dockerfile that isn't as easy to troubleshoot as the current R-Conda system route. Dockerfiles are essentially a last-ditch configuration file that MyBinder can handle, because the other configuration files are much easier and don't require knowing Dockerfile syntax. MyBinder aims to offer the ability to take advantage of Docker offering containers with a specified environment without users needing to know anything about Docker.
There is a Binder Help category for posting to get help at the Jupyter Discourse Forum. Some other examples of posts already there may help you troubleshoot.
Notice of a common pitfall
Most of the configuration files for making a repository Binder-ready are simply text and can be edited right in the GitHub browser interface, without needing git or even cloning the repo locally.
Last I knew, there are two exceptions to this. The postBuild and start configuration files have settings that allow them to be run as scripts, and these get altered in a way that they no longer work if you edit them via the GitHub browser interface. (This was my experience when last I tried. Your mileage may vary or things may have changed now.) To edit those, you have to have git available on a system you have and pull one from some other source. Then edit it on that machine, add it to your repo, and push it back up from your local computer.
(If this is a problem, you can post in the Jupyter Discourse Forum Binder help category and you and I could coordinate where I fork and edit those files in your repo to your specifications and then make a pull request to update your source of the fork with those changes.)
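Assuming the "settings" in question are the Unix executable permission bits (an assumption on my part, the answer above does not name them), a local clone can be checked and repaired with a few lines before committing. The helper names are illustrative only.

```python
import os
import stat


def is_executable(path):
    """True when the owner-execute bit is set on the file."""
    return bool(os.stat(path).st_mode & stat.S_IXUSR)


def make_executable(path):
    """Add execute permission for user, group, and others; keep other bits."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

On a filesystem that does not track this bit, git update-index --chmod=+x postBuild records the same change in the repository index.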
If you are using Jupyter notebooks extensively, then it may make sense to use Binder.
But if you simply want to use R and RStudio, then all you need is Docker. A good resource is
https://github.com/rocker-org/rocker

R studio not responding

My RStudio is opening multiple processes (opening RStudio in different windows) when I open just one. I am not able to open any project. It becomes unresponsive. I tried running in compatibility mode, running as admin, and also uninstalling, but the same problem persists. Can someone please help me?
Before you reinstall everything, which may take a lot of time, it is worth removing the application data, a cache that RStudio keeps of settings and information from the last session and tries to reopen every time you start it.
For app data, look under your user folder, which should be somewhere like
c:\Users\<your_user_here>\AppData\Local
c:\Users\<your_user_here>\AppData\LocalLow
c:\Users\<your_user_here>\AppData\Roaming
Delete every subfolder called R, RStudio-Desktop or RStudio under these folders. Don't worry, you won't lose your source program files and projects. It may help you recover everything without having to start over from scratch.
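If you would rather see what is there before deleting anything, a dry-run listing along these lines may help. It is a sketch only: the function name is made up, the folder names are the three mentioned above, and the base folders are passed in rather than hard-coded.

```python
from pathlib import Path

# Folder names taken from the advice above.
CACHE_NAMES = {"R", "RStudio-Desktop", "RStudio"}


def find_cache_folders(base_dirs):
    """List direct subfolders of each base dir whose name is in CACHE_NAMES."""
    found = []
    for base in base_dirs:
        base = Path(base)
        if not base.is_dir():
            continue  # silently skip base folders that do not exist
        for child in base.iterdir():
            if child.is_dir() and child.name in CACHE_NAMES:
                found.append(child)
    return sorted(found)
```

On Windows, the LOCALAPPDATA and APPDATA environment variables give two of the base folders. Once you have reviewed the list, each entry can be removed by hand or with shutil.rmtree.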

Why does "move directories" on a NAS take so long?

I am quite unsure of how the move files/directories use case in a client and NAS scenario technically works - perhaps someone can enlighten me or tell me if this is normal OS-behavior.
I have a NAS ( Synology DiskStation ) in a Gigabit-LAN with sometimes big directories ( in the range of ~ 10GB ) which I want to move somewhere else on the same NAS ( even on the same hard disk ).
The problem is that if I move a directory from, let's say,
//diskstation:/dir_foo/dir_1/src_1
to
//diskstation:/dir_foo/dir_2/
via my Windows 7 desktop PC in Explorer (I even tried it in Finder on a MacBook), this can take up to 10 minutes (or the like), and I really wonder why this is the case.
To me this seems as if the whole data was first transported over the LAN to my client PC and then moved back to the NAS!?
Shouldn't Explorer or the NAS notice that this is a local file operation, so that the data doesn't have to be transported through my LAN and the move is much quicker?
How can I analyze whether the file movement is really executed over the LAN? Because if I wanted to do this kind of operation via VPN from outside, it would be pretty much unusable...
Is this normal behavior?
It's hard to give a firm answer, because it depends. What access protocol are you using, and what operation are you performing? Is it a drag-drop in your GUI?
Your NAS does what it's told. It almost certainly implements some sort of internal rename function, that means you don't need to copy data in order to 'move' it.
If you do this from the command line, using 'move' or 'mv' (depending on DOS/Unix), do you have the same problem? I'm prepared to bet you don't, because you're telling the NAS to rename, and it will, and it'll be fine.
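The difference between a rename and a copy can be seen on any local filesystem. The sketch below moves a directory with a single rename and checks that the file inside keeps the same inode, meaning no data was rewritten. It runs against a temporary local directory, not a NAS, and the helper names are made up; whether a client talking to a network share ends up issuing a server-side rename depends on the protocol and the tool.

```python
import os
import tempfile


def move_by_rename(src, dst_parent):
    """Move src into dst_parent with one rename call; no file data is copied."""
    dst = os.path.join(dst_parent, os.path.basename(src))
    # os.rename raises OSError if src and dst are on different filesystems.
    os.rename(src, dst)
    return dst


def demo():
    """Return (inode_unchanged, source_still_exists) after a rename-based move."""
    with tempfile.TemporaryDirectory() as root:
        src = os.path.join(root, "dir_1", "src_1")
        dst_parent = os.path.join(root, "dir_2")
        os.makedirs(src)
        os.makedirs(dst_parent)
        payload = os.path.join(src, "data.bin")
        with open(payload, "wb") as fh:
            fh.write(b"x" * 1024)
        inode_before = os.stat(payload).st_ino
        new_dir = move_by_rename(src, dst_parent)
        inode_after = os.stat(os.path.join(new_dir, "data.bin")).st_ino
        return inode_before == inode_after, os.path.exists(src)
```

A tool that falls back to copy-then-delete would create new files instead, which is where the time goes with a 10 GB directory.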
Move it from the NAS's own GUI instead of the file explorer.
If you use Windows Explorer to move the files, your OS will first download each file from the source directory to the client PC and then upload it to the target directory, because you're using Samba shares.
If you want to move files quickly within your NAS, the best way is to use PuTTY or WinSCP, which use SSH, FTP, etc.

RStudio server - Hangs when switching projects

I am currently running RStudio via a server installation that I only partially maintain. I am working with some fairly large data sets and models (> 9 million rows of 611 variables). When I try to switch projects, RStudio hangs when loading the project (it says "Switching projects to..." at the top) or, if it loads, takes forever.
RStudio works otherwise while attempting to switch projects, but menus and the like do not work.
I have searched thoroughly for a fix to no avail. How would I go about troubleshooting (or, ideally, fixing) the problem?
RStudio is running on a Linux (openSUSE) VM.
Thanks in advance.
EDIT: Per this thread: https://stackoverflow.com/a/15373596/3469671, I deleted .rstudio from my home directory and that seemed to free things up. Is there some setting I can change to facilitate the loading of larger projects?
Per this thread, I deleted .rstudio from my home directory and that seemed to free things up.
As a practice, I learned to stop saving my workspace while it was loaded with data and included steps within projects to save and load prepared data.

Running Visual Studio in Parallels for mac - problem with debugging sites sitting in os x drive

I've installed parallels desktop on my MacBook to be able to run Visual Studio 2008 in a XP installation. Everything works great except when I decided to put my websites in my sites folder in the os x file system (Which by default automatically happens because the My Documents folder is mapped to the Mac's Documents folder, and I'd rather put my code there so that both OS's can easily access it.).
When trying to build or debug I get this error:
Failed to start monitoring changes to 'Z:\xxx...'
How do I get it so that I can get it to work under Parallels, from the shared drive?
Parallels uses network drives to simulate folders on OS X, and Windows can't monitor changes to network drives, so if you do this directly, it'll be broken.
If you want to keep them in sync though, use Live Mesh (http://www.mesh.com) and install it on both the host and guest. A little roundabout, but it'll make it so both copies are maintained (and Live Mesh is handy for other things too)
I recently flipped over to putting my source code onto my Mac volume, so I could use Time Machine to back it up and immediately got this same problem with my ASP.NET app. Other, procedural applications, built just fine, by the way.
I tried all sorts of things, including using Samba on the Mac side to share the directory, which led into the "too many BIOS commands" error described elsewhere. Unfortunately for me, the Registry hacks to fix that problem never worked for some reason.
I finally found another solution that avoids Samba and just uses the regular Parallels Shared Folders. It too is a Registry hack, but this one simply turns off file change monitoring for ASP.NET. It is a bit heavy-handed, but gets my builds to work again.
The reference for this change is here:
http://support.microsoft.com/kb/911272
The downside to this approach, I am finding, is that you need to be more deliberate about recompiling, or restarting the web server, as changes during development don't just magically appear anymore. I am still deciding whether that is a useful tradeoff.
UPDATE: After several days of this, development was just too difficult and, sadly, what I reverted to was keeping my source inside the Parallels virtual disk. To enable Time Machine backups and Spotlight searches, I used a lightweight MS utility called SyncToy to push stuff out of Parallels and out to my Mac drive several times a day. Despite the high hack factor, it is working well.
I know this isn't strictly a solution, but VMware Fusion is superior when it comes to shared drive space on a virtual machine. It's what I currently use and it hasn't let me down thus far...
People always give me odd looks when they see visual studio on my mac :P
Try moving the project onto the VM's C drive. It's not an ideal situation, but you can access the VM's C drive from OS X.
I have a similar problem with a PHP site that uses an MS Access database (it's a client's system). I have aliases that point to the PHP site on the VM so that I can still do all of my coding in OS X. To do this I created a network share on the VM and then connected to it from OS X. Once connected, make the aliases. If the network drive is not open and you open a file in OS X, it will try to reconnect. It means the VM will need to be running to get to the files, but this isn't normally a problem since the VM is hosting the site anyway.
.NET has funny issues trying to debug the objects on a network drive.
Make sure that you have full trust on your local network between your Mac and XP install.
Check out: http://msdn.microsoft.com/en-us/library/aa302361.aspx
If that research leads nowhere, I'm afraid you will have to look into the option of keeping it on the VM disk and moving it when you need it.
I see a similar problem on my machine connected to the windows domain. My documents is mapped to a network share and I can't debug|run|etc. I had to eventually move to my local disk for debugging.
I definitely recommend Live Mesh as a way to keep directories in sync. Just keep the VM's directory in sync with the Mac's directory.
Or use SVN to hold copies in both machines and do commit/update as appropriate. That way you get versioning, history and if your project grows bigger, you can share with other devs.
I know Dropbox also has history and sharing, but not check-in/check-out, conflict handling, and all the other advantages of real source control.
Oh, if you have money you can also go for TFS. I would but it is just too expensive :)
