R_LIBS_USER ignored by R

I am using bash on CentOS Linux. I set R_LIBS_USER in my $HOME/.bashrc, and also tried setting it in $HOME/.Renviron:
# in .bashrc
export R_LIBS_USER=$HOME/lib/R/site-library
cat .Renviron
R_LIBS_USER=$HOME/lib/R/site-library
I made sure R_LIBS_USER is set properly: echo shows the expected value. However, R started from that same terminal does not pick the value up.
> .libPaths()
[1] "/global/software/r/3.5.0/lib64/R/library"
The .libPaths() output only shows the default library path instead of my personal one; $HOME/lib/R/site-library is missing. When I try to load packages installed in $HOME/lib/R/site-library I get a "package not found" error. If I add the personal path from inside R, I can then load packages from my personal R library directory.
> .libPaths(c("/myhome/lib/R/site-library", .libPaths()))
> .libPaths()
[1] "/myhome/lib/R/site-library"
[2] "/global/software/r/3.5.0/lib64/R/library"
I have searched for two days for a solution. Other people had similar problems and also found no solution. My personal library path used to be picked up, but I don't know what I changed to break this normal behavior.

Finally figured it out: the .Renviron file does not expand $HOME the way .bashrc does, so R_LIBS_USER is set to the literal string $HOME/lib/R/site-library. Without expansion this is not a valid path. R will not tell you that it is wrong; it just silently moves on with its normal business (I think this is a design flaw in R, no exception handling).
.libPaths("/nonexistent/path") # adds nothing to the result of .libPaths();
# it keeps the default R library, removes any existing
# personal library, and reports nothing wrong about the
# nonexistent directory
Before the .libPaths("/nonexistent/path") call, I had two elements in the vector: the default /default/R/library and the other, /my/personal/R/lib. After the call only the former is left. R is full of surprises.
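Since R drops a missing library directory without comment, one option is to check the path yourself before relying on it. This is only a minimal sketch: check_lib_dir is a made-up helper name, not something R provides.

```r
# Hypothetical helper (not part of base R): report a missing library directory
# instead of letting R silently ignore it.
check_lib_dir <- function(path) {
  ok <- nzchar(path) && dir.exists(path)
  if (!ok) {
    warning("library directory does not exist: ", sQuote(path), call. = FALSE)
  }
  ok
}
```

Calling check_lib_dir(Sys.getenv("R_LIBS_USER")) near the top of a script or in .Rprofile would flag an unexpanded value such as $HOME/lib/R/site-library.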

Close, and a valid solution, but there is one more thing you can try: you can reuse your exported environment variables in .Renviron, so you don't need to maintain the path in multiple places.
You should be able to achieve that if you replace
R_LIBS_USER=$HOME/lib/R/site-library
with
R_LIBS_USER="${HOME}/lib/R/site-library"
The "${...}/..." form (braces around the variable name, with the whole path or expression enclosed in double or single quotes) should make the expansion in .Renviron succeed.
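The difference between the two forms can be tried out in a running session with readRenviron(), without touching your real .Renviron. This is a sketch only: DEMO_BASE, DEMO_BRACED and DEMO_PLAIN are made-up variable names, and the quotes are left out to keep the comparison to the braces alone.

```r
# Made-up base variable standing in for HOME.
Sys.setenv(DEMO_BASE = "/tmp/demo")

# Write a throwaway Renviron-style file with both spellings.
renviron <- tempfile("Renviron")
writeLines(c(
  "DEMO_BRACED=${DEMO_BASE}/lib/R/site-library",
  "DEMO_PLAIN=$DEMO_BASE/lib/R/site-library"
), renviron)

# Process the file the same way R processes .Renviron at startup.
readRenviron(renviron)

Sys.getenv("DEMO_BRACED")  # braced form
Sys.getenv("DEMO_PLAIN")   # bare $NAME form
```

Comparing the two printed values shows which spelling was substituted when the file was read.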

Related

R reads the path to a .app file as a folder on macOS

Currently I am trialing a package that requires defining a path to where an external application is located. This application (for anyone curious) does not have a formal install and is simply a .app file to be placed in any given folder.
In R, I need to define a path to this file location, e.g.,
program_path <- ~/Desktop/Folder/dsi_studio.app
However, R interprets this as a folder, and thus when I try to run a few functions within the package, it claims the program_path I fed it is a directory and not a program/application.
Is there any way to force R to read this path as an application and not as a folder? I even went as far as defining program_path as the .app's Unix Executable File (i.e., dsi_studio.app/Contents/MacOS/dsi_studio), but no dice. I must be missing something here.
Thanks for any help!
With that code you might not be getting a text/character object at all: unquoted, ~ is the operator that creates an R formula object. Try putting quotation marks around the file path. You might also need path.expand(), since ~ on macOS is a shortcut for /Users/user_name/. Try instead:
program_path <- "~/Desktop/Folder/dsi_studio.app"
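To see the difference described above, compare the class of an unquoted tilde expression with that of the quoted path. This is just a sketch; the path is the one from the question and may not exist on your machine.

```r
# An unquoted tilde builds a formula object.
f <- ~ Desktop
class(f)

# A quoted path is an ordinary character string.
program_path <- "~/Desktop/Folder/dsi_studio.app"
class(program_path)

# Expand the leading tilde to the home directory.
full_path <- path.expand(program_path)

# A .app bundle is a directory on disk, so both checks are worth looking at.
dir.exists(full_path)
file.exists(full_path)
```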

How to get the absolute path from the context of the current running R script?

Using Python, if I need the absolute path from the context of the current running script, all I need to do is add the following to the code of that script:
import os
os.path.abspath(__file__)
This is very useful as having the absolute path I can then use os.path.join to form new absolute paths for my project components (inside the project directory tree) and more interesting is that everything will continue to work without any problem no matter where the package directory is moved.
I need to achieve the very same thing using R programming, that is obtaining the absolute path of the current running R script ( = the absolute path of its file on the disk). But trying to do the same in R turns out to be quite challenging, at least for me as a rather beginner in R.
After a lot of googling, I tried to use the reticulate package to call Python from R, but __file__ is not available there. Then I found a few threads on Stack Overflow suggesting inspecting the call stack, and others suggesting the use of normalizePath. However, none of these worked for me when the entire project package is moved from one directory to another.
Therefore, I would like to know if for example you have the following file/directory tree
base_dir ( = /home/usr1/apps/R/base_dir)
|
|
|___ myscript.R (this is my R script to be run)
|___ data (this is a directory)
|___ sql (this is a directory)
Is there any solution allowing to add something in the code of myscript.R so that inside the script the program can always know that the base directory is /home/usr1/apps/R/base_dir and if later this base directory is moved to another directory then there is no need to change the code and the program would be able to find correctly the new base directory?
R has in general no way of finding this path, because there is no equivalent to Python’s __file__ in R.
The closest you can get is to look at commandArgs() and laboriously extract the script filename (which requires different handling depending on how the script was launched!). But this will fail if the script was executed in RStudio, and it will fail after calling setwd().
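As an illustration of the commandArgs() approach, here is a minimal sketch that only covers the case where the script was started with Rscript (which passes the script as a --file= argument). script_path is a made-up name, and the limitations mentioned above (RStudio, setwd()) still apply.

```r
# Hypothetical helper: pull the script path out of the --file= argument.
# Returns NULL when no such argument is present (e.g. an interactive session).
script_path <- function(args = commandArgs(trailingOnly = FALSE)) {
  file_arg <- grep("^--file=", args, value = TRUE)
  if (length(file_arg) == 0) {
    return(NULL)
  }
  # Relative paths are resolved against the current working directory,
  # so call this before any setwd().
  normalizePath(sub("^--file=", "", file_arg[1]), mustWork = FALSE)
}
```

With that, base_dir <- dirname(script_path()) near the top of a script run via Rscript would give the script's directory.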
Other solutions (such as the ‘here’ package) rely on heuristics and specific project structures.
But luckily there’s actually a solution that will always work: use ‘box’ modules.
With modules, you’ll always be able to get the path of the current script/module via box::file(). This is the closest equivalent to Python’s __file__ you’ll get in R, and it always works — as long as you’re using ‘box’ modules consistently.
(Internally the ‘box’ package requires complex logic to determine the value of the file() function in all circumstances; I don’t recommend replicating it, it’s too complex. For the curious, the bulk of the relevant logic is in R/loaded.r.)
If you are running the script using Rscript you can use getwd().
#!/usr/bin/Rscript
getwd()
# or assign it to a variable
base_dir = getwd()
You can run it from the command line using one of the following:
./yourscript.R
# or
Rscript yourscript.R
Note, however, that this only works if you run the script from inside the folder the file is in:
cd ~
./script.R
# "/home/usr1"
cd /
/home/usr1/script.R
# "/"
For a more elaborate option you could consider https://stackoverflow.com/a/55322344/3250126

How can I set `path.expand` to begin at my working directory?

I'm using a Mac. path.expand('~') resolves to a directory several folders above my desired working directory. For example:
path.expand('~')
[1] "/Users/my.name"
I'd like to change it to something like this:
path.expand('~')
[1] "/Users/my.name/drive/R/project/sub.folder"
How can I go about this?
Thank you.
On all Unix-like systems (including macOS), the tilde is special in that it refers to what the operating system considers the home directory (via the env var HOME).
There are two types of answers to this. Can it be done? Perhaps, sure even. Should it be done? There will likely be unintended consequences (that may be hard to troubleshoot and/or workaround), so likely not.
This works on my ubuntu box:
me@mybox:/some/path$ Rscript -e 'Sys.getenv("HOME")'
[1] "/home/me"
me@mybox:/some/path$ HOME=/tmp/ Rscript -e 'Sys.getenv("HOME")'
[1] "/tmp/"
me@mybox:/some/path$ Rscript -e 'Sys.setenv(HOME="/tmp/");Sys.getenv("HOME")'
[1] "/tmp/"
(This notably does not work as well on Windows ... which is not very unix-y of it!)
So you can try overriding it with either:
Sys.setenv(HOME = "/Users/my.name/drive/R/project/sub.folder"), or
Set the HOME variable in your working environment before starting R.
This might have unintended consequences. For instance, R looks for ~/.Rprofile, and git and other commands look for ~/.gitconfig and such.
My recommended way-ahead would be to define a variable and change there. If you use RStudio, then its "Projects" can always start you in the correct directory. If not and you still want this "special directory" available to you, perhaps add this to your /Users/username/.Rprofile (in your "actual" homedir)
.specialdir <- "/Users/my.name/drive/R/project/sub.folder"
and, whenever you need to go there, use setwd(.specialdir) (or build paths with file.path(.specialdir, ...)). One side-effect of this is that any of your code, functions, reports, or whatever else uses this will no longer be reproducible.
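A small sketch of the variable-based approach, leaving HOME and the tilde alone. in_special is a made-up helper name, and the directory is the one from the question.

```r
# The "special" directory, as it might be defined in ~/.Rprofile.
.specialdir <- "/Users/my.name/drive/R/project/sub.folder"

# Hypothetical helper: build paths relative to the special directory.
in_special <- function(...) {
  file.path(.specialdir, ...)
}

in_special("some_data.csv")
```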
A way to easily reference your files without needing to change the HOME directory is to use the here package. This basically uses a heuristic to find the project root based on where you are working. Normally it looks for an RStudio project file (.Rproj) or for a .git directory if your working directory is inside a git repository. It's easy to use and robust to moving machines, accidental use of setwd, or even forgetting to set HOME on a different machine/profile.
If your data file some_data.csv is stored in /Users/my.name/drive/R/project/sub.folder/some_data.csv, where project is the root folder for the project:
here::here()
[1] "/Users/my.name/drive/R/project"
here::here("sub.folder", "some_data.csv")
[1] "/Users/my.name/drive/R/project/sub.folder/some_data.csv"
and you can use it as a drop-in replacement for the path, as in:
data <- readr::read_csv(here::here("sub.folder", "some_data.csv"))

What is the equivalent to R_HISTFILE for R data files

.RData files are starting to invade my directory structure. I would like to retain a single one in a specified directory. Is there such an ENV variable similar to R_HISTFILE?
This is in reference to the default save/restore directory for R workspaces.
UPDATE: The answer by JThorpe led to the following solution:
Set the env var R_PROFILE_USER to a profile file in the desired location. I am using my home dir,
i.e.:
export R_PROFILE_USER=/Users/steve/.Rprofile
In that file, call setwd (set working directory),
i.e.:
$ cat ~/.Rprofile
setwd('/Users/steve')
Now the .RData will always load/save to the home dir (or whatever dir you put in setwd).
I personally dislike having R retain anything between sessions because it makes for difficult-to-find errors owing to variables that persist between sessions. Hence I set the "no-save" and "no-restore" options so that R neither writes its current state to an .RData file nor attempts to read in an old state. If I do happen to want to save an R session (this happens VERY rarely) I call savehistory().
Methods for setting command line options in OS X can be found here, and what follows describes setting command line options for R (or any other program) in Windows.
To set the no-save and no-restore options in Windows, right-click on the R icon that you use to start an R session and select the 'Properties' option. In the properties box, the "Target" string should look something like this:
"C:\Program Files\R\R-3.1.2\bin\i386\Rgui.exe"
To this string, add the string ' --no-save --no-restore'. Note that there is a space before each of the double-dashes. The target should now look something like this:
"C:\Program Files\R\R-3.1.2\bin\i386\Rgui.exe" --no-save --no-restore
Click 'OK' or 'Apply' to save these options. Note that these are per-icon (shortcut) settings. I have several icons with different command line options depending on the setting I want in the R session. Additional command line arguments to R can be found here.
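If you run with --no-save and --no-restore but occasionally want to keep a workspace, it can be written to one fixed file explicitly. This is a sketch under assumptions: save_session and restore_session are made-up names, and tempdir() stands in for whatever single directory you prefer.

```r
# Hypothetical fixed location; tempdir() is used only so the sketch runs anywhere.
workspace_file <- file.path(tempdir(), "session.RData")

# Save every object in `envir` to the fixed file.
save_session <- function(file = workspace_file, envir = globalenv()) {
  save(list = ls(envir, all.names = TRUE), file = file, envir = envir)
  invisible(file)
}

# Load the objects back into `envir` if the file exists.
restore_session <- function(file = workspace_file, envir = globalenv()) {
  if (file.exists(file)) {
    load(file, envir = envir)
  }
  invisible(file)
}
```

Because the file name is passed explicitly, no .RData file is dropped into whichever directory R happened to be started from.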
Try the Load command in the Hmisc package. It uses the LoadPath option for this.
If you create a .Rprofile file you can specify a default working directory. See ?Startup for more details including the ability to set a site-wide profile. I just referred to that help page to make sure the .RData would be affected by that setting.

get filename and path of `source`d file

How can a sourced or Sweaved file find out its own path?
Background:
I work a lot with .R scripts or .Rnw files.
My projects are organized in a directory structure, but the path of the project's base directory frequently varies between different computers (e.g. because I just do parts of data analysis for someone else, and their directory structure is different from mine: I have projects base directories ~/Projects/StudentName/ or ~/Projects/Studentname/Projectname and most students who have just their one Project usually have it under ~/Measurements/ or ~/DataAnalysis/ or something the like - which wouldn't work for me).
So a line like
setwd (my.own.path ())
would be incredibly useful as it would allow to ensure the working directory is the base path of the project regardless of where that project actually is. Without the need that the user must think of setting the working directory.
Let me clarify: I look for a solution that works with pressing the editor's/IDE's source or Sweave Keyboard shortcut of the unthinking user.
Just FYI, knitr will setwd() to the dir of the input file when (and only when) evaluating the code chunks, i.e. if you call knit('path/to/input.Rnw'), the working dir will be temporarily switched to path/to/. If you want to know the input dir in code chunks, currently you can call an unexported function knitr:::input_dir() (I may export it in the future).
Starting from gsk3's and Seb's suggestions, here's an idea:
The combination of username (login) and IP or name of the computer could be used to select the right directory.
That leads to something like:
setwd (switch (paste (Sys.info () [c ("user", "nodename")], collapse="."),
               user.laptop = "~/Messungen",
               user2.server = "~/Projekte/Projekt/"
))
So there is an automatic solution, that
works with source
works with Sweave
even works for interactive sessions where the commands are sent line by line
the combination of user and nodename of course needs to be specific
the paths need to be edited by hand, though.
Improvements welcome!
Update:
Gabor Grothendieck answered the following to a related question on r-help today:
this.dir <- dirname(parent.frame(2)$ofile)
setwd(this.dir)
which will work for source.
Another update: I now do most of the data analysis work in RStudio. RStudio's projects basically solve the problem: RStudio changes the working directory to the project root directory every time I switch between projects.
I can therefore put the project directory as far down my directory tree as I want (and the students can also put their copy wherever they want) and sync the data files and scripts/.Rnws via version control (We use a private git server). The RStudio project files are kept out of the version control, i.e. .gitignore contains .Rproj.user.
Obviously, within the project, the directory structure needs to be synchronized.
You can use sys.calls() to get the command used to source the file. Then you need a bit of trickery using regular expressions to get the pathname, bearing in mind that source("something/filename") could have used either the absolute or relative path. Here's a first attempt at putting all the pieces together: try inserting the following lines at the top of a source file.
whereFrom <- sys.calls()[[1]]
# This should be an expression that looks something like
# source("pathname/myfilename.R")
whereFrom <- as.character(whereFrom[2]) # get the pathname/filename
whereFrom <- paste(getwd(), whereFrom, sep = "/") # prefix it with the current working directory
pathnameIndex <- gregexpr(".*/", whereFrom) # we want the string up to the final '/'
pathnameLength <- attr(pathnameIndex[[1]], "match.length")
whereFrom <- substr(whereFrom, 1, pathnameLength - 1)
print(whereFrom) # or "setwd(whereFrom)" to set the working directory
It's not very robust—for instance, it will fail on windows with source("pathname\\filename"), and I haven't tested what happens if you have one file sourcing another file—but you might be able to build a solution on top of this.
I have no direct solution how to obtain the directory of the file itself but if you have a limited range of directories and directory structures you can probably use
if (dir.exists("c:/somedir")) setwd("c:/somedir")
You could check out the pattern of the directory in question and then set the dir. Does this help you?
An additional problem is that the working directory is a global variable, which can be changed by any script, so if your script calls another script, it will have to set the wd back. In RStudio I use Session -> Set Working Directory -> To Source File Location (I know, it's not ideal), and then my script does
wd = getwd ()
...
source ("mySubDir/myOtherScript.R", chdir=TRUE); setwd (wd)
...
source ("anotherSubDir/anotherScript.R", chdir=TRUE); setwd (wd)
In this way one can maintain a stack of working directories. I would love to see this implemented in the language itself.
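The save-and-restore pattern can be wrapped in a small function so the previous working directory is put back even if the code fails. This is a sketch with a made-up name, with_wd; the withr package offers with_dir() for the same purpose.

```r
# Hypothetical helper: run `code` with `dir` as the working directory,
# then restore the previous working directory, even on error.
with_wd <- function(dir, code) {
  old <- setwd(dir)               # setwd() returns the previous directory
  on.exit(setwd(old), add = TRUE)
  force(code)                     # `code` is evaluated lazily, i.e. after setwd()
}

# Example: the working directory is only changed inside the call.
before <- getwd()
inside <- with_wd(tempdir(), getwd())
after <- getwd()
```

Nested calls behave like a stack, because each call restores the directory it found when it started.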
This answer works for source and also inside nvim-R - I have no idea if it works with knitr and similar things. Any feedback appreciated.
If you have multiple scripts source-ing each other, it is important to get the correct one. That is, the largest i for which sys.frame(i)$ofile exists.
get.full.path.to.this.sourced.script = function() {
    for(i in sys.nframe():1) {  # Go through all the call frames,
                                # in *reverse* order.
        x = sys.frame(i)$ofile
        if(!is.null(x))               # if $ofile exists,
            return(normalizePath(x))  # then return the full absolute path
    }
}
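One way to try the frame-walking idea is to write a throwaway script, source() it, and compare what it detected with the real path. This sketch restates the function under a shorter made-up name, find_sourced_path, so that it is self-contained; it relies on source() keeping the file name in a local variable called ofile, as the answer above does.

```r
# Same idea as above: find the innermost call frame that has `ofile`.
find_sourced_path <- function() {
  for (i in rev(seq_len(sys.nframe()))) {
    ofile <- sys.frame(i)$ofile
    if (!is.null(ofile)) {
      return(normalizePath(ofile))
    }
  }
  NULL  # not running inside source()
}

# Throwaway script that records its own path when sourced.
tmp_script <- tempfile(fileext = ".R")
writeLines("detected <- find_sourced_path()", tmp_script)

# Source it into a separate environment and inspect the result.
env <- new.env()
source(tmp_script, local = env)
env$detected
```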

Resources