How to get roxygenise() to check for duplicate function definitions

I am compiling a package using roxygen2. I would like to make sure that no function is defined twice with the same name. However, roxygenise() currently builds the package without issuing a warning.
E.g.
library(roxygen2)
#' Real function
real_function <- function(){print("hello world")}
#' Fake function
real_function <- function(){}
Calling roxygenise() leads to the second definition being used.

I don't think roxygenise can or should do this. If you want to check yourself for duplicate names, you can e.g. run through the files in a directory and attach each file sequentially. The attach function has a warn.conflicts argument that is TRUE by default.
check_duplicate_names <- function(dir){
  files <- list.files(dir)
  # Source each file into a fresh environment and attach it;
  # attach() warns about masked names by default (warn.conflicts = TRUE),
  # so a function defined in two files triggers a warning here
  for (file in file.path(dir, files)){
    duplicate_test_env <- new.env()
    sys.source(file, envir = duplicate_test_env)
    attach(duplicate_test_env)
  }
  # Clean up: detach one attached environment per file
  for (i in seq_along(files)){
    detach(duplicate_test_env)
  }
}
check_duplicate_names("path-to-package/R")
Note that this will not catch duplicate definitions within a single file.
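If you also need to catch duplicates within one file, one option (a minimal sketch, not part of the approach above; find_duplicates_in_file is a made-up helper name) is to parse the file and look for repeated top-level assignment targets:
find_duplicates_in_file <- function(file){
  exprs <- parse(file)
  # keep only top-level assignments of the form `name <- value` or `name = value`
  is_assign <- vapply(exprs, function(e)
    is.call(e) && (identical(e[[1]], as.name("<-")) ||
                   identical(e[[1]], as.name("="))), logical(1))
  # extract the assigned names and report any that occur more than once
  targets <- vapply(exprs[is_assign], function(e) deparse(e[[2]]), character(1))
  unique(targets[duplicated(targets)])
}
find_duplicates_in_file("path-to-package/R/some_file.R")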

Related

How can I source specific functions in an R script?

I have a script with my most commonly used functions which I source at the top of most scripts. Sometimes I only want to get one of the functions in that script, but I don't know how to indicate that I only want one specific function. I'm looking for a function that is similar to the :: used to get a function inside a package. A reproducible example:
# file a.R
foo <- function() cat("Hello!\n")
bar <- function() cat("Goodbye!\n")
# End of file a.R
# file b.R
# Can't just delete all functions
fun <- function(x) print(x)
fun("It's so late!")
source("a.R")
foo()
fun("See you next time")
# End of file
I read the "source" help and it was unhelpful to me. The solution I currently have is to assign a variable at the start of the script with the functions loaded before, then set the difference with what was there after:
list_before <- lsf.str()
# content of file b.R
new_funcs <- setdiff(lsf.str(),list_before)
Then I can use rm(list=new_funcs[-1]) to keep only the function I wanted. This is, however a very convoluted way of doing this and I was hoping to find an easier solution.
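Put together, the workaround might look like this (a sketch, using foo() from a.R as the function to keep):
list_before <- lsf.str()
source("a.R")
new_funcs <- setdiff(as.character(lsf.str()), as.character(list_before))
# remove everything newly sourced except foo()
rm(list = setdiff(new_funcs, "foo"))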
A good way would be to write a package but it requires more knowledge (not there myself).
A good alternative I found is to use the box package, which allows you to import functions from an R script as a module.
You can import all functions or specific functions.
To set up the functions as a module, you use roxygen2-style documentation comments, like so:
#' This is a function to calculate a sum
#' @export
my_sum <- function(x, y){
  x + y
}
#' This is a function to calculate a difference
#' @export
my_diff <- function(x, y){
  x - y
}
Save the file as an R script, e.g. "my_module.R".
The @export tag in the documentation tells box which functions the module exports. You can then call box to reach a specific function in the module named "my_module".
Let's say your project directory has a script folder that contains your scripts and modules; you would then import functions as follows:
box::use(./script/my_module)
my_module$my_sum(x, y)
box::use() creates an environment that contains all the functions found inside the module.
You can also import single functions, as follows. Let's assume your directory is a bit more complex as well, with modules inside a box folder inside script.
box::use(./script/box/my_module[my_sum])
my_sum(x, y)
You can use box to fetch functions from packages as well. In a sense, it is better than calling library(), which attaches all of the package's exported functions.
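For instance (a minimal sketch of the package form, assuming dplyr is installed):
# import only selected functions from an installed package
box::use(dplyr[filter, mutate])
# or import a whole package, accessed through its name
box::use(stats)
stats$rnorm(5)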
Using box, you can organize scripts by objective or whatever organization you have in place. I have a script of string-handling functions from which I fetch what I need, a script of plotting functions that I use in my projects, and so on.
insertSource() would help.
In your example, let's presume we need to import foo() from a.R:
# file b.R
foo <- function(){}  # a stub with the right name, to be filled in by insertSource()
insertSource("a.R", functions = "foo", force = TRUE)
foo <- foo@.Data  # unwrap the functionWithTrace object back to a plain function

Creating a function that runs even if some files don't import properly

I've got a function to import multiple files. It is shown below:
tucson_function <- function(x) {
  df <- read.tucson(x)
  final1 <- as.data.frame(df)
  final2 <- rownames_to_column(final1, 'year')
  site_ID <- sub('\\.rwl$', '', x)
  final2 <- cbind(final2, site_ID)
  final3 <- reshape2::melt(final2)
  final3
}
read.tucson() is from the dplR package and is used to import files with the .rwl extension. I then import the files as follows:
asia_data<-lapply(asia, tucson_function)
The issue is that tucson_function fails for several files in the folder. That is fine, but in its current form a failure stops the rest of the files from loading (it throws a warning and stops the function). I would like the function to ignore the warning, discard the failing files, and continue importing the rest.
How could I do this?
We can use possibly() from purrr or tryCatch() from base R. With possibly(), specify the value to return via otherwise in case the call fails:
library(purrr)
ptucson_function <- possibly(tucson_function, otherwise = NA)
map(asia, ptucson_function)
In base R, we can use tryCatch
lapply(asia, function(x) tryCatch(tucson_function(x), error=function(e) NA))
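If some files fail with warnings rather than errors, tryCatch() can intercept those as well (a sketch; the handler simply skips the file):
lapply(asia, function(x) tryCatch(
  tucson_function(x),
  error   = function(e) NA,
  warning = function(w) NA  # treat a warning as a failure and move on
))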

R Script as a Function

I have a long script that involves data manipulation and estimation. I have it set up to use a set of parameters, though I would like to be able to run this script multiple times with different sets of inputs, kind of like a function.
Running the script produces plots and saves estimates to a csv, I am not particularly concerned with the objects it creates.
I would rather not wrap the script in a function as it is meant to be used interactively.
How do people go about doing something like this?
I found this for command-line arguments: How to pass command-line arguments when source() an R file, but it still doesn't solve the interactive problem.
I have dealt with something similar before; below is the solution I came up with. I basically use list2env() to push variables to either the global environment or the function's local environment, and then source() the file in the designated environment. This can be quite useful, especially when coupled with exists() as shown in the example below, which allows you to keep your script stand-alone.
These two questions may also be of help:
Source-ing an .R script within a function and passing a variable through (RODBC)
How to pass command-line arguments when source() an R file
# Function ----------------------------------------------------------------
subroutine <- function(file, param = list(), local = TRUE, ...) {
  list2env(param, envir = if (local) environment() else globalenv())
  source(file, local = local, ...)
}

# Example -----------------------------------------------------------------
# Create an example script
tmp <- "test_subroutine.R"
cat("if (!exists('msg')) msg <- 'no argument provided'; print(msg)", file = tmp)

# Example of using exists() in the script to keep it stand-alone
subroutine(tmp)

# Evaluate in the function's environment
subroutine(tmp, list(msg = "use function's environment"), local = TRUE)
exists("msg", envir = globalenv()) # FALSE

# Evaluate in the global environment
subroutine(tmp, list(msg = "use global environment"), local = FALSE)
exists("msg", envir = globalenv()) # TRUE

unlink(tmp)
Just to clarify what was alluded to in Hansi's comment, here is one approach to this issue:
Wrap the script into a function, since this will let you go up one level of abstraction if needed, and will also make it easier to call the function whenever it is needed in any other script.
In cases where you want to use the script interactively, you can put a browser() call somewhere in your script. At the point where browser() is called, the function will pause and keep the environment as-is within the function, and you can then step through the function and use R interactively from within the function.
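For instance (a minimal sketch; run_analysis and its body are made up):
run_analysis <- function(input_file) {
  dat <- read.csv(input_file)
  browser()  # execution pauses here: inspect `dat`, run code interactively, type `c` to continue
  summary(dat)
}
run_analysis("my_data.csv")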
In the base package, check ?commandArgs; you can use this to parse out arguments from the command line.
If I have a script, test.R, containing the code:
args <- commandArgs(trailingOnly = TRUE)
for (arg in args){
  print(arg)
}
and I call it from the command line with Rscript as follows:
Rscript test.R arg1 arg2 arg3
The output is:
[1] "arg1"
[1] "arg2"
[1] "arg3"

Functions in .Rprofile are not found when they are in .env

I have a .Rprofile I copied from https://www.r-bloggers.com/fun-with-rprofile-and-customizing-r-startup/ However, when I load my R session, the functions that are in .env don't work, while the functions not in .env work perfectly. Here is an example:
sshhh <- function(a.package){
  suppressWarnings(suppressPackageStartupMessages(
    library(a.package, character.only = TRUE)))
}

auto.loads <- c("dplyr", "ggplot2")
if(interactive()){
  invisible(sapply(auto.loads, sshhh))
}

.env <- new.env()
attach(.env)

.env$unrowname <- function(x) {
  rownames(x) <- NULL
  x
}

.env$unfactor <- function(df){
  id <- sapply(df, is.factor)
  df[id] <- lapply(df[id], as.character)
  df
}

message("\n*** Successfully loaded .Rprofile ***\n")
Once R is loaded I can type sshhh and it shows the function, but if I type unfactor it shows "object not found".
Any help? Should I put all the functions in my workspace?
The functions created in a separate environment are intentionally hidden. This is to protect them from calls to rm(list = ls()).
From the original article:
[Lines 58-59]: This creates a new hidden namespace that we can store
some functions in. We need to do this in order for these functions to
survive a call to “rm(list=ls())” which will remove everything in the
current namespace. This is described wonderfully in this blog post [1].
To use the unfactor function, you would call .env$unfactor().
If you want to make those functions available in the global namespace without having to refer to .env, you can simply leave out the whole .env part and just add the functions the same way you did for the sshhh function.
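For example (a sketch of both options; my_df is a made-up data frame):
# Option 1: call the hidden function through the environment
.env$unfactor(my_df)
# Option 2: define it at the top level of .Rprofile instead,
# giving up the protection from rm(list = ls())
unfactor <- function(df){
  id <- sapply(df, is.factor)
  df[id] <- lapply(df[id], as.character)
  df
}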
[1] http://gettinggeneticsdone.blogspot.com.es/2013/07/customize-rprofile.html

How do I load objects to the current environment from a function in R?

Instead of doing
a <- loadBigObject("a")
b <- loadBigObject("b")
I'd like to call a function like
loadBigObjects(list("a","b"))
And be able to access the a and b objects.
It is not clear what loadBigObjects() does or where it will look for a and b. Does it load the objects from a file, or by sourcing code?
There are lots of options in general:
sys.source() allows an R file to be sourced into a given environment (see the sketch after this list)
load() will load an .RData file into a given environment (also sketched below)
assign(), in combination with any object created by loadBigObjects() or a call to readRDS(), can also load an object into a given environment
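For the first two options, a minimal sketch (the file names funcs.R and objects.RData are hypothetical):
# source a script of definitions straight into the global environment
sys.source("funcs.R", envir = globalenv())
# load saved objects from an .RData file into the global environment
load("objects.RData", envir = globalenv())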
From within your function, you'll want to specify the environment in which to load objects as the Global Environment by using globalenv(). If you don't do that then the object will only exist in the evaluation frame of the running loadBigObjects(). E.g.
loadBigObjects <- function(list) {
  lapply(list, function(x) assign(x, readRDS(x), envir = globalenv()))
}
(As per your comment on @GSee's answer, and assuming list("a","b") is sufficient information for readRDS() to locate and open the objects.)
Without knowing anything about what loadBigObject is or does, you can use lapply to apply a function to a list of objects
lapply(list("a", "b"), loadBigObject)
If you provide the code for loadBigObject, or at least describe what it is supposed to do, a better loadBigObjects function could probably be written.
The assign function can be used to define a variable in an environment other than the current one.
loadBigObjects <- function(lst) {
  lapply(lst, function(l) {
    assign(l, loadBigObject(l), envir = globalenv())
  })
  lst
}
(Not that this is necessarily a good idea.)
