Why is on.exit not allowed in examples on CRAN?

I submitted a package to CRAN with the following setup code for a temp file, within an example in the documentation:
#' conn <- DBI::dbConnect(RSQLite::SQLite(), tmp <- tempfile())
#' on.exit(DBI::dbDisconnect(conn), add=TRUE)
#' on.exit(file.remove(tmp), add=TRUE)
I received the form letter back:
Please make sure that you do not change the user's options, par or
working directory. If you really have to do so within functions, please
ensure with an *immediate* call of on.exit() that the settings are reset
when the function is exited. e.g.:
...
old <- options() # code line i
on.exit(options(old)) # code line i+1
...
options(digits = 3)
...
If you're not familiar with the function, please check ?on.exit. This
function makes it possible to restore options before exiting a function
even if the function breaks. Therefore it needs to be called immediately
after the option change within a function.
Please always make sure to reset to user's options(), working directory
or par() after you changed it in examples and vignettes and demos.
e.g.:
old <- options(digits = 3)
...
options(old)
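For reference, the pattern the letter describes looks like this when written out inside a complete (hypothetical) function:
summarise_briefly <- function(x) {
  old <- options(digits = 3)          # change the option...
  on.exit(options(old), add = TRUE)   # ...and register the reset immediately
  summary(x)                          # the old options are restored even if this errors
}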
However, if I set a breakpoint using debug(file.remove) and run the example with example(myfun), I can clearly see the temporary file get deleted appropriately.
What is the rationale behind this request?

Related

retrieve original version of package function even if over-assigned

Suppose I replace a function of a package, for example knitr:::sub_ext.
(Note: I'm particularly interested in the case where it is an internal function, i.e. only accessible by ::: as opposed to ::, but the same answer may work for both.)
library(knitr)
my.sub_ext <- function (x, ext) {
  return("I'm in your package stealing your functions D:")
}
# replace knitr:::sub_ext with my.sub_ext
knitr <- asNamespace('knitr')
unlockBinding('sub_ext', knitr)
assign('sub_ext', my.sub_ext, knitr)
lockBinding('sub_ext', knitr)
Question: is there any way to retrieve the original knitr:::sub_ext after I've done this? Preferably without reloading the package?
(I know some people will want to know why I would want to do this, so here it is. Not required reading for the question.) I've been patching some functions in packages like so (not actually the sub_ext function...):
original.sub_ext <- knitr:::sub_ext
new.sub_ext <- function (x, ext) {
  # some extra code that does something first, e.g.
  x <- do.something.with(x)
  # now call the original knitr:::sub_ext
  original.sub_ext(x, ext)
}
# now set knitr:::sub_ext to new.sub_ext like before.
I agree this is not in general a good idea (in most cases these are quick fixes until changes make their way into CRAN, or they are "feature requests" that would never be approved because they are somewhat case-specific).
The problem with the above is that if I accidentally execute it twice (e.g. it's at the top of a script that I run twice without restarting R in between), the second time around original.sub_ext is actually the previous new.sub_ext rather than the real knitr:::sub_ext, so I get infinite recursion.
Since sub_ext is an internal function (I wouldn't call it directly, but functions from knitr like knit all call it internally), I can't hope to modify all the functions that call sub_ext to call new.sub_ext manually, hence the approach of replacing the definition in the package namespace.
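(A simple guard, incidentally, is to stash the original only if it has not been stashed already, e.g. if (!exists("original.sub_ext")) original.sub_ext <- knitr:::sub_ext.)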
When you do assign('sub_ext', my.sub_ext, knitr), you are irrevocably overwriting the value previously associated with sub_ext with the value of my.sub_ext. If you first stash the original value, though, it's not hard to reset it when you're done:
library(knitr)
knitr <- asNamespace("knitr")
## Store the original value of sub_ext
.sub_ext <- get("sub_ext", envir = knitr)
## Overwrite it with your own function
my.sub_ext <- function (x, ext) "I'm in your package stealing your functions D:"
assignInNamespace('sub_ext', my.sub_ext, knitr)
knitr:::sub_ext("eg.csv", "pdf")
# [1] "I'm in your package stealing your functions D:"
## Reset when you're done
assignInNamespace('sub_ext', .sub_ext, knitr)
knitr:::sub_ext("eg.csv", "pdf")
# [1] "eg.pdf"
Alternatively, as long as you are just adding lines of code to what's already there, you could add that code using trace(). What's nice about trace() is that, when you are done, you can use untrace() to revert the function's body to its original form:
trace(what = "mean.default",
tracer = quote({
a <- 1
b <- 2
x <- x*(a+b)
}),
at = 1)
mean(1:2)
# Tracing mean.default(1:2) step 1
# [1] 4.5
untrace("mean.default")
# Untracing function "mean.default" in package "base"
mean(1:2)
# [1] 1.5
Note that if the function you are tracing is in a namespace, you'll want to use trace()'s where argument, passing it the name of some other (exported) function that shares the to-be-traced function's namespace. So, to trace an unexported function in knitr's namespace, you could set where = knit.
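A rough sketch of that for the example above (untested against current knitr internals; the tracer body is purely illustrative):
library(knitr)
trace("sub_ext",
      tracer = quote(message("sub_ext() was called")),
      where = knit)          # knit is exported and shares knitr's namespace
# ... call something that uses sub_ext internally, e.g. knit() ...
untrace("sub_ext", where = knit)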

Save package settings between sessions

Is there a definitive way to save options or information pertaining to a certain package between sessions?
For example, say somebody made a game and released it as an R package. If they wanted to save high scores and not have them reset each time R started a new session, what would be the best way to do this? Currently I can only think of storing a file in the user's home directory, but I'm not sure if I like that approach.
This may be an approach. I created a dummy package with a dummy function (any function I create is bound to be a dummy function) and a data set I called scores that I set as follows:
scores <- NA
Then I created the package with the scores data set.
Then I used the following to change the data set from within R.
loc <- paste0(find.package("new"), "/Data")
unlink(paste0(loc, "/scores.rda"), recursive = TRUE, force = FALSE)
scores <- 10
save(scores, file=paste0(loc, "/scores.rda"))
Then when I unloaded the library and reloaded it again, the data set now says:
> scores
[1] 10
Could this be modified to do what you want? You'd have to have it save in between somehow, but I'm not sure how to do this without messing with the .Last function.
EDIT:
It appears this option is not viable in that when you build the package with lazy loading of data, the data sets are stored as Rdata.rdb and Rdata.rdx, not as .rda files. That means the approach I use above is fairly worthless if we want the data to be recognized automatically.
EDIT2
This approach works, and I tried it on a package I made. You can't lazy-load the data, and you have to either use data(scores) explicitly or call data(scores) inside the function you're calling. I also assigned scores to .scores in the global environment the first time it was created and used exists() inside the function to check whether it exists. If .scores existed, I assigned it to scores within the function. Once you unload the library and load it again, you never have to worry about that again.
Maybe an alternative is to save this as a function somehow that can be altered using Josh's advice here: Permanently replacing a function
I guess there is no way to store settings without saving them to disk or a database, some way or another. It can be done silently though by putting the code below in your ~/.Rprofile. However, if you have packages that save settings in other ways than using options you need to add them manually.
I know this is exactly what you said you did not want, but it might spark some debate at least.
.Last <- function(){
  my.options <- options()
  save(my.options, file="~/.Roptions.Rdata")
}
.First <- function(){
  tryCatch({
    load("~/.Roptions.Rdata")
    do.call(options, my.options)
    rm(my.options)
  }, error=function(...){})
}
To my surprise, try(..., silent=TRUE) gives a warning on startup if ~/.Roptions.Rdata does not exist, which is why I used tryCatch instead.
The modern answer to this problem is well explained at https://blog.r-hub.io/2020/03/12/user-preferences/
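For instance, on recent R versions (>= 4.0) the base function tools::R_user_dir() returns a per-user directory that a package can persist such data in; a minimal sketch, where the package name and score values are hypothetical:
score_file <- file.path(tools::R_user_dir("yourpackage", which = "data"), "scores.rds")
dir.create(dirname(score_file), recursive = TRUE, showWarnings = FALSE)
saveRDS(data.frame(user = "one", score = 500), score_file)   # save in one session
scores <- readRDS(score_file)                                # restore in a later session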
I think I will be trying the hoardr package! Here is an example that worked for me :)
x <- hoardr::hoard()
x$cache_path_set("yourpackage", type = 'user_cache_dir')
x$mkdir()
scores <- data.frame(
  user = c("one", "two", "three"),
  score = c(500, 200, 1100)
)
save(scores, file = file.path(x$cache_path_get(), "scores.rdata"))
x$list()
x$details()
#new session
x <- hoardr::hoard()
x$cache_path_set("yourpackage", type = 'user_cache_dir')
x$list()
x$details()
load(file = file.path(x$cache_path_get(), "scores.rdata"))
PS - you can see a working example in the rnoaa package, found on GitHub at "ropensci/rnoaa". Check their R/onload.r file! I can expand if needed.

How to preserve changes to function with fix() between R sessions?

If I edit a function with R v2.14.0 using fix(), those fixes are applied during the session.
For example, I might make the following edit to get a white background in a hive plot:
> library(HiveR)
> fix(plotHive)
... :%s/black/white/g
... :w
... :q
> plotHive(myHiveData)
I then get a white background in the hive plot, as expected.
But if I quit and reopen R, I have lost those changes, and the plot has a black background again.
How do I preserve the edits I make with fix() between R sessions?
EDIT
If I source() the modified plotHive() function, I get the following error:
> modifiedPlotHive <- source("modifiedPlotHive.R")
Error in source("modifiedPlotHive.R") :
  modifiedPlotHive.R:1160:1: unexpected '<'
  1159: }
  1160: <
        ^
In addition: Warning message:
In readLines(file) : incomplete final line found on 'modifiedPlotHive.R'
The final line in the modified plotHive() function is:
<environment: namespace:HiveR>
If I remove this line before source()-ing, then the function no longer works.
Sorry I missed this when it came out, but the latest version of HiveR has the option to control the background color (available on CRAN as 0.2-1). Bryan
Here's the safer way of doing what you want, referenced by @joran.
The sink/source pair is fine for dealing with R code files. But saving to text files and then reading back in other types of objects can strip them of important attributes, especially those relating to environments. That's what you just experienced.
The save/load pair stores objects in R's own binary format, so is much less liable to lose important information/environments attached to functions.
In this example, I define a personal version of ls, which differs from the base function in that it by default lists objects that start with a dot/period:
my_ls <- ls
fix(my_ls)
# 1) On the first line, change 'all.names=FALSE' to 'all.names=TRUE'
# 2) Say "Yes", I want to save the changes
save("my_ls", file="my_ls.Rdata")
# Then, in a later session, test that it works
load("my_ls.Rdata")
.TrysToHide <- 99
my_ls()
# [1] ".TrysToHide" "my_ls"
One more note: it's much cleaner to give your modified function a name of its own. To really edit a packaged function, and have the changes persist, you'd need to edit the sources and recompile the package. But if you do that, beware, as you may well break the function for other packaged functions that depend on it.
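That said, a hedged sketch of one way to re-apply such an edit automatically at load time, combining save()/load() with the assignInNamespace() trick from the first question above (the file and object names are assumptions, and this inherits all the caveats of modifying a namespace):
## in ~/.Rprofile
setHook(packageEvent("HiveR", "onLoad"), function(...) {
  load("~/modifiedPlotHive.Rdata")   # assumed to contain the edited `plotHive`
  utils::assignInNamespace("plotHive", plotHive, ns = "HiveR")
})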
There are a couple of options:
1. Save your workspace before quitting and load it again when you reopen R.
2. Save the modified function to a script file and source it:
sink("modified_plotHive.r")
plotHive
sink()
In the next session:
plotHive <- source("modified_plotHive.r")
HTH

How to make "resident folder" to be the working directory? [duplicate]

Is there a way to programmatically find the path of an R script inside the script itself?
I am asking this because I have several scripts that use RGtk2 and load a GUI from a .glade file.
In these scripts I am obliged to put a setwd("path/to/the/script") instruction at the beginning, otherwise the .glade file (which is in the same directory) will not be found.
This is fine, but if I move the script in a different directory or to another computer I have to change the path. I know, it's not a big deal, but it would be nice to have something like:
setwd(getScriptPath())
So, does a similar function exist?
This works for me:
getSrcDirectory(function(x) {x})
This defines an anonymous function (that does nothing) inside the script, and then determines the source directory of that function, which is the directory where the script is.
For RStudio only:
setwd(dirname(rstudioapi::getActiveDocumentContext()$path))
This works when Running or Sourceing your file.
Use source("yourfile.R", chdir = T)
Exploit the implicit "--file" argument of Rscript
When calling the script using "Rscript" (Rscript doc) the full path of the script is given as a system parameter. The following function exploits this to extract the script directory:
getScriptPath <- function(){
cmd.args <- commandArgs()
m <- regexpr("(?<=^--file=).+", cmd.args, perl=TRUE)
script.dir <- dirname(regmatches(cmd.args, m))
if(length(script.dir) == 0) stop("can't determine script dir: please call the script with Rscript")
if(length(script.dir) > 1) stop("can't determine script dir: more than one '--file' argument detected")
return(script.dir)
}
If you wrap your code in a package, you can always query parts of the package directory.
Here is an example from the RGtk2 package:
> system.file("ui", "demo.ui", package="RGtk2")
[1] "C:/opt/R/library/RGtk2/ui/demo.ui"
>
You can do the same with a directory inst/glade/ in your sources which will become a directory glade/ in the installed package -- and system.file() will compute the path for you when installed, irrespective of the OS.
This answer works fine to me:
script.dir <- dirname(sys.frame(1)$ofile)
Note: script must be sourced in order to return correct path
I found it in: https://support.rstudio.com/hc/communities/public/questions/200895567-can-user-obtain-the-path-of-current-Project-s-directory-
But I still don´t understand what is sys.frame(1)$ofile. I didn´t find anything about that in R Documentation. Someone can explain it?
#' current script dir
#' #param
#' #return
#' #examples
#' works with source() or in RStudio Run selection
#' #export
z.csd <- function() {
# http://stackoverflow.com/questions/1815606/rscript-determine-path-of-the-executing-script
# must work with source()
if (!is.null(res <- .thisfile_source())) res
else if (!is.null(res <- .thisfile_rscript())) dirname(res)
# http://stackoverflow.com/a/35842176/2292993
# RStudio only, can work without source()
else dirname(rstudioapi::getActiveDocumentContext()$path)
}
# Helper functions
.thisfile_source <- function() {
for (i in -(1:sys.nframe())) {
if (identical(sys.function(i), base::source))
return (normalizePath(sys.frame(i)$ofile))
}
NULL
}
.thisfile_rscript <- function() {
cmdArgs <- commandArgs(trailingOnly = FALSE)
cmdArgsTrailing <- commandArgs(trailingOnly = TRUE)
cmdArgs <- cmdArgs[seq.int(from=1, length.out=length(cmdArgs) - length(cmdArgsTrailing))]
res <- gsub("^(?:--file=(.*)|.*)$", "\\1", cmdArgs)
# If multiple --file arguments are given, R uses the last one
res <- tail(res[res != ""], 1)
if (length(res) > 0)
return (res)
NULL
}
A lot of these solutions are several years old. While some may still work, there are good reasons against utilizing each of them (see linked source below). I have the best solution (also from source): use the here library.
Original example code:
library(ggplot2)
setwd("/Users/jenny/cuddly_broccoli/verbose_funicular/foofy/data")
df <- read.delim("raw_foofy_data.csv")
Revised code
library(ggplot2)
library(here)
df <- read.delim(here("data", "raw_foofy_data.csv"))
This solution is the most dynamic and robust because it works regardless of whether you are using the command line, RStudio, calling from an R script, etc. It is also extremely simple to use and is succinct.
Source: https://www.tidyverse.org/articles/2017/12/workflow-vs-script/
I have found something that works for me.
setwd(dirname(rstudioapi::getActiveDocumentContext()$path))
How about using system and shell commands? With the windows one, I think when you open the script in RStudio it sets the current shell directory to the directory of the script. You might have to add cd C:\ e.g or whatever drive you want to search (e.g. shell('dir C:\\*file_name /s', intern = TRUE) - \\ to escape escape character). Will only work for uniquely named files unless you further specify subdirectories (for Linux I started searching from /). In any case, if you know how to find something in the shell, this provides a layout to find it within R and return the directory. Should work whether you are sourcing or running the script but I haven't fully explored the potential bugs.
#Get operating system
OS<-Sys.info()
win<-length(grep("Windows",OS))
lin<-length(grep("Linux",OS))
#Find path of data directory
#Linux Bash Commands
if(lin==1){
file_path<-system("find / -name 'file_name'", intern = TRUE)
data_directory<-gsub('/file_name',"",file_path)
}
#Windows Command Prompt Commands
if(win==1){
file_path<-shell('dir file_name /s', intern = TRUE)
file_path<-file_path[4]
file_path<-gsub(" Directory of ","",file_path)
filepath<-gsub("\\\\","/",file_path)
data_directory<-file_path
}
#Change working directory to location of data and sources
setwd(data_directory)
Thank you for the function, though I had to adjust it a Little as following for me (W10):
#Windows Command Prompt Commands
if(win==1){
file_path<-shell('dir file_name', intern = TRUE)
file_path<-file_path[4]
file_path<-gsub(" Verzeichnis von ","",file_path)
file_path<-chartr("\\","/",file_path)
data_directory<-file_path
}
In my case, I needed a way to copy the executing file to back up the original script together with its outputs. This is relatively important in research. What worked for me while running my script on the command line, was a mixure of other solutions presented here, that looks like this:
library(scriptName)
file_dir <- gsub("\\", "/", fileSnapshot()$path, fixed=TRUE)
file.copy(from = file.path(file_dir, scriptName::current_filename()) ,
to = file.path(new_dir, scriptName::current_filename()))
Alternatively, one can add to the file name the date and our to help in distinguishing that file from the source like this:
file.copy(from = file.path(current_dir, current_filename()) ,
to = file.path(new_dir, subDir, paste0(current_filename(),"_", Sys.time(), ".R")))
None of the solutions given so far work in all circumstances. Worse, many solutions use setwd, and thus break code that expects the working directory to be, well, the working directory — i.e. the code that the user of the code chose (I realise that the question asks about setwd() but this doesn’t change the fact that this is generally a bad idea).
R simply has no built-in way to determine the path of the currently running piece of code.
A clean solution requires a systematic way of managing non-package code. That’s what ‘box’ does. With ‘box’, the directory relative to the currently executing code can be found trivially:
box::file()
However, that isn’t the purpose of ‘box’; it’s just a side-effect of what it actually does: it implements a proper, modern module system for R. This includes organising code in (nested) modules, and hence the ability to load code from modules relative to the currently running code.
To load code with ‘box’ you wouldn’t use e.g. source(file.path(box::file(), 'foo.r')). Instead, you’d use
box::use(./foo)
However, box::file() is still useful for locating data (i.e. OP’s use-case). So, for instance, to locate a file mygui.glade from the current module’s path, you would write.
glade_path = box::file('mygui.glade')
And (as long as you’re using ‘box’ modules) this always works, doesn’t require any hacks, and doesn’t use setwd.

Getting path of an R script

Is there a way to programmatically find the path of an R script inside the script itself?
I am asking this because I have several scripts that use RGtk2 and load a GUI from a .glade file.
In these scripts I am obliged to put a setwd("path/to/the/script") instruction at the beginning, otherwise the .glade file (which is in the same directory) will not be found.
This is fine, but if I move the script in a different directory or to another computer I have to change the path. I know, it's not a big deal, but it would be nice to have something like:
setwd(getScriptPath())
So, does a similar function exist?
This works for me:
getSrcDirectory(function(x) {x})
This defines an anonymous function (that does nothing) inside the script, and then determines the source directory of that function, which is the directory where the script is.
For RStudio only:
setwd(dirname(rstudioapi::getActiveDocumentContext()$path))
This works when Running or Sourcing your file.
Use source("yourfile.R", chdir = T)
Exploit the implicit "--file" argument of Rscript
When calling the script using "Rscript" (Rscript doc) the full path of the script is given as a system parameter. The following function exploits this to extract the script directory:
getScriptPath <- function(){
  cmd.args <- commandArgs()
  m <- regexpr("(?<=^--file=).+", cmd.args, perl=TRUE)
  script.dir <- dirname(regmatches(cmd.args, m))
  if(length(script.dir) == 0) stop("can't determine script dir: please call the script with Rscript")
  if(length(script.dir) > 1) stop("can't determine script dir: more than one '--file' argument detected")
  return(script.dir)
}
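Usage, at the top of a script run non-interactively (e.g. Rscript path/to/myscript.R), would then be along the lines of:
setwd(getScriptPath())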
If you wrap your code in a package, you can always query parts of the package directory.
Here is an example from the RGtk2 package:
> system.file("ui", "demo.ui", package="RGtk2")
[1] "C:/opt/R/library/RGtk2/ui/demo.ui"
>
You can do the same with a directory inst/glade/ in your sources which will become a directory glade/ in the installed package -- and system.file() will compute the path for you when installed, irrespective of the OS.
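For instance, assuming your package is called mygui and ships inst/glade/gui.glade in its sources:
glade_file <- system.file("glade", "gui.glade", package = "mygui")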
This answer works fine for me:
script.dir <- dirname(sys.frame(1)$ofile)
Note: the script must be sourced in order to return the correct path.
I found it in: https://support.rstudio.com/hc/communities/public/questions/200895567-can-user-obtain-the-path-of-current-Project-s-directory-
But I still don't understand what sys.frame(1)$ofile is. I couldn't find anything about it in the R documentation. Can someone explain it?
#' current script dir
#' @param
#' @return
#' @examples
#' works with source() or in RStudio Run selection
#' @export
z.csd <- function() {
  # http://stackoverflow.com/questions/1815606/rscript-determine-path-of-the-executing-script
  # must work with source()
  if (!is.null(res <- .thisfile_source())) res
  else if (!is.null(res <- .thisfile_rscript())) dirname(res)
  # http://stackoverflow.com/a/35842176/2292993
  # RStudio only, can work without source()
  else dirname(rstudioapi::getActiveDocumentContext()$path)
}
# Helper functions
.thisfile_source <- function() {
  for (i in -(1:sys.nframe())) {
    if (identical(sys.function(i), base::source))
      return (normalizePath(sys.frame(i)$ofile))
  }
  NULL
}
.thisfile_rscript <- function() {
  cmdArgs <- commandArgs(trailingOnly = FALSE)
  cmdArgsTrailing <- commandArgs(trailingOnly = TRUE)
  cmdArgs <- cmdArgs[seq.int(from=1, length.out=length(cmdArgs) - length(cmdArgsTrailing))]
  res <- gsub("^(?:--file=(.*)|.*)$", "\\1", cmdArgs)
  # If multiple --file arguments are given, R uses the last one
  res <- tail(res[res != ""], 1)
  if (length(res) > 0)
    return (res)
  NULL
}
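Usage, per the comments above, is then simply:
script_dir <- z.csd()
setwd(script_dir)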
A lot of these solutions are several years old. While some may still work, there are good reasons against using each of them (see the linked source below). I have the best solution (also from that source): use the here package.
Original example code:
library(ggplot2)
setwd("/Users/jenny/cuddly_broccoli/verbose_funicular/foofy/data")
df <- read.delim("raw_foofy_data.csv")
Revised code
library(ggplot2)
library(here)
df <- read.delim(here("data", "raw_foofy_data.csv"))
This solution is the most dynamic and robust because it works regardless of whether you are using the command line, RStudio, calling from an R script, etc. It is also extremely simple to use and is succinct.
Source: https://www.tidyverse.org/articles/2017/12/workflow-vs-script/
I have found something that works for me.
setwd(dirname(rstudioapi::getActiveDocumentContext()$path))
How about using system and shell commands? On Windows, I think that when you open the script in RStudio the current shell directory is set to the directory of the script. You might have to add cd C:\ or whatever drive you want to search (e.g. shell('dir C:\\*file_name /s', intern = TRUE); the \\ escapes the escape character). This will only work for uniquely named files unless you further specify subdirectories (for Linux I started searching from /). In any case, if you know how to find something in the shell, this provides a layout for finding it within R and returning the directory. It should work whether you are sourcing or running the script, but I haven't fully explored the potential bugs.
#Get operating system
OS <- Sys.info()
win <- length(grep("Windows", OS))
lin <- length(grep("Linux", OS))
#Find path of data directory
#Linux Bash Commands
if(lin==1){
  file_path <- system("find / -name 'file_name'", intern = TRUE)
  data_directory <- gsub('/file_name', "", file_path)
}
#Windows Command Prompt Commands
if(win==1){
  file_path <- shell('dir file_name /s', intern = TRUE)
  file_path <- file_path[4]
  file_path <- gsub(" Directory of ", "", file_path)
  file_path <- gsub("\\\\", "/", file_path)
  data_directory <- file_path
}
#Change working directory to location of data and sources
setwd(data_directory)
Thank you for the function, though I had to adjust it a little as follows for me (Windows 10):
#Windows Command Prompt Commands
if(win==1){
  file_path <- shell('dir file_name', intern = TRUE)
  file_path <- file_path[4]
  file_path <- gsub(" Verzeichnis von ", "", file_path)
  file_path <- chartr("\\", "/", file_path)
  data_directory <- file_path
}
In my case, I needed a way to copy the executing file to back up the original script together with its outputs. This is relatively important in research. What worked for me while running my script on the command line was a mixture of other solutions presented here, and it looks like this:
library(scriptName)
file_dir <- gsub("\\", "/", fileSnapshot()$path, fixed = TRUE)
file.copy(from = file.path(file_dir, scriptName::current_filename()),
          to   = file.path(new_dir, scriptName::current_filename()))
Alternatively, one can add the date and hour to the file name to help distinguish that file from the source, like this:
file.copy(from = file.path(current_dir, current_filename()),
          to   = file.path(new_dir, subDir, paste0(current_filename(), "_", Sys.time(), ".R")))
None of the solutions given so far work in all circumstances. Worse, many solutions use setwd, and thus break code that expects the working directory to be, well, the working directory — i.e. the directory that the user of the code chose (I realise that the question asks about setwd() but this doesn't change the fact that this is generally a bad idea).
R simply has no built-in way to determine the path of the currently running piece of code.
A clean solution requires a systematic way of managing non-package code. That’s what ‘box’ does. With ‘box’, the directory relative to the currently executing code can be found trivially:
box::file()
However, that isn’t the purpose of ‘box’; it’s just a side-effect of what it actually does: it implements a proper, modern module system for R. This includes organising code in (nested) modules, and hence the ability to load code from modules relative to the currently running code.
To load code with ‘box’ you wouldn’t use e.g. source(file.path(box::file(), 'foo.r')). Instead, you’d use
box::use(./foo)
However, box::file() is still useful for locating data (i.e. the OP's use case). So, for instance, to locate a file mygui.glade from the current module's path, you would write:
glade_path = box::file('mygui.glade')
And (as long as you’re using ‘box’ modules) this always works, doesn’t require any hacks, and doesn’t use setwd.
