source() functions preventing the read of downstream functions - r

I am trying to teach myself R and coming from a Python programming background.
I am clearly having problems with sourcing in one file (file_read_functions.R) when the functions stored in it are called from a file in the same directory (from read_files.R).
file_read_functions is as follows:
constant_source <- 'constants.R'
function_source <- 'file_read_functions.R'
class_source <- 'classes.R'
source(class_source)
source(constant_source)
source(function_source)
cellecta_counts = read_cellecta_counts(filepath = cell_counts_by_gene_id)
file_read_functions.R is as follows:
constants <- 'constants.R'
classes <- 'classes.R'
assignments <- 'assignment_functions.R'
source(constants)
source(classes)
source(assignments)
read_cellecta_counts = function(filepath) {
print("hello")
return(filepath)}
With the above, if I move read_cellecta_counts to before the source functions, the code can successfully find the function. What might be the cause?

This seems like a straight-forward error message to me. The function object wasn't found, so that means you haven't defined it anywhere, or haven't loaded it.
If it's a function from a package, maybe you forgot to load the package, or call the function as package::function(). If it is a function you wrote as a simple script, maybe you forgot to source it or define it locally. If it's a function you wrote as part of a package, you can load all functions by using the shortcut CTRL+SHIFT+L in RStudio.
That said, I believe you can benefit a lot from reading the chapter on Debugging from Hadley Wickham's "Advanced R" book. It is really well written and easy to understand, especially for beginners in the R language. The chapter will teach you how to use some debugging tools, either interactively or not. You can find it here.

I found it out. By commenting out the parts piece-wise in file_read_functions, I found that there a typo within assignment_functions.R. It was a simple typo; I had the following:
constant_source <- 'constants.R'
source(constants_source)
The extra 's' did it.
If you get an error like this in the future, be sure to check all of the upstream source() files; if there is an error in any of those it will not be able to find any proceeding functions.
Thank you all for your patience.

Related

Determining if there are unused packages in an R script [duplicate]

As my code evolves from version to version, I'm aware that there are some packages for which I've found better/more appropriate packages for the task at hand or whose purpose was limited to a section of code which I've now phased out.
Is there any easy way to tell which of the loaded packages are actually used in a given script? My header is beginning to get cluttered.
Update 2020-04-13
I've now updated the referenced function to use the abstract syntax tree (AST) instead of using regular expressions as before. This is a much more robust way of approaching the problem (it's still not completely ironclad). This is available from version 0.2.0 of funchir, now on CRAN.
I've just got around to writing a quick-and-dirty function to handle this which I call stale_package_check, and I've added it to my package (funchir).
e.g., if we save the following script as test.R:
library(data.table)
library(iotools)
DT = data.table(a = 1:3)
Then (from the directory with that script) run funchir::stale_package_check('test.R'), we'll get:
Functions matched from package data.table: data.table
**No exported functions matched from iotools**
Have you considered using packrat?
packrat::clean() would remove unused packages, for example.
I've written a command-line script to accomplish this task. You can find it in this Github gist. I'm sure there are edge cases that it misses, but it works pretty well, on both R scripts and Rmd files.
My approach always is to close my R script or IDE (i.e. RStudio) and then start it again.
After this I run my function without loading any dependecies/packages beforehand.
This should result in various warning and error messages telling you which functions couldn't be found and executed. This again will give you hints on what packages are necessary to load beforehand and which one you can leave out.

R function example requires nonstandard dataset, doesn't jive with devtools

I've been struggling to get the example code for a function working using devtools::check(), because the data required for the example is not in .RData format. Unfortunately, the way the function is written, .RData cannot be loaded and work properly. The function takes in a list of filenames and performs an action on them collectively.
Therefore, example code must be written in a way that check() is able to access a folder and list the files therein. Using the function on my own computer, I input
setwd("/Users/mydirectory")
myfilelist <- list.files(pattern = "mypattern")
output <- myfunction(myfilelist, ...)
and everything is groovy. But this doesn't work with devtools because #examples doesn't know how to access subdirectories on my computer. check() pulls the following error:
base::assign(".ptime", proc.time(), pos = "CheckExEnv")
This is almost undoubtedly because check() doesn't know where to look for the data. I'd like it to look toward github to access the online data repository.
I found this brief conversation regarding a similar roxygen-related problem, but overall I haven't seen much advice on how to work through it. I think that perhaps this issue starts to get a little closer to my situation, but here the user failed to export a function, rather than bind data to an example.
I don't think I'm looking for a pull function (though the end goal is to pull data...), does anyone have advice moving forward? I have the data stored in the inst/extdata folder on github, so while I don't really have something reproducible for you all I'm hoping you might have some thoughts.
Edit: I worked around the problem using #alistaire's advice below, and guiding the roxygen to the package directory (updated on github) and also using \dontrun{}. However, I am leaving the question unanswered for now because I think accessing data stored in github should still be somehow possible and we haven't yet addressed that.

how to use utils::globalVariables

Following your recommendations (or trying to do it, at least), I have tried some options, but the problem remains, so there must be something I am missing.
I have included a more complete code
setwd("C:/naapp")
#' #import utils
#' #import devtools
I have tried with and without using suppressForeignCheck
if(getRversion() >= "2.15.1"){
utils::globalVariables(c("eleven"))
utils::suppressForeignCheck(c("eleven"))
}
myFunctionSum <- function(X){print(X+eleven)}
myFunctionMul <- function(X){print(X*eleven)}
myFunction11 <- function(X){
assign("eleven",11,envir=environment(myFunctionMul))
}
maybe I should use a particular environment?
package.skeleton(name = "myPack11", list=ls(),
path = "C:/naapp", force = TRUE,
code_files = character())
I remove the "man" directory from the directory myPack11,
otherwise I would get an error because the help files are empty.
I add the imports utils, and devtools to the descrption
Then I run check
devtools::check("myPack11")
And I still get this note
#checking R code for possible problems ... NOTE
#myFunctionMul: no visible binding for global variable 'eleven'
#myFunctionSum: no visible binding for global variable 'eleven'
#Undefined global functions or variables:eleven
I have tried also to make an enviroment, combining Tomas Kalibera's suggetion and an example I found in the Internet.
myEnvir <- new.env()
myEnvir$eleven <- 11
etc
In this case, I get the same note, but with "myEnvir", instead of "eleven"
First version of the question
I trying to use "globalVariables" from the package utils. I am building an interface in R and I am planning to submit to CRAN. This is my first time, so, sorry if the question is very basic.I have read the help and I have tried to find examples, but I still don't know how to use it.
I have made a little silly example to ilustrate my question, which is:
Where do I have to place this line exactly?:
if(getRversion() >= "2.15.1"){utils::globalVariables("eleven")}
My example has three functions. myFunction11 creates the global variable "eleven" and the other two functions manipulate it. In my real code, I cannot use arguments in the functions that are called by means of a button. Consider that this is just a silly example to learn how to use globalVariables (to avoid binding notes).
myFunction11 <- function(){
assign("eleven",11,envir=environment(myFunctionSum))
}
myFunctionSum <- function(X){
print(X+eleven)
}
myFunctionMul <- function(X){
print(X*eleven)
}
Thank you in advance
I thought that the file globals.R would be automatically generated when using globalsVariables. The problem was that I needed to create the package skeleton, then create the file globals.R, add it to the R directory in the package and check the package.
So, I needed to place this in a different file:
#' #import utils
utils::globalVariables(c("eleven"))
and save it
The documentation clearly says:
## In the same source file (to remind you that you did it) add:
if(getRversion() >= "2.15.1") utils::globalVariables(c(".obj1", "obj2"))
so put it in the same source file as your functions. It can go in any of your R source files, but the comment above recommends you put it close to your code. Looking at a bunch of github packages reveals another common pattern is to have a globals.R function with it in, but this is probably a bad idea. If you later remove the global from your package but neglect to update globals.R you could mask a problem. Putting it right close to the functions that use it will hopefully remind you when you edit those functions.
Make sure you put it outside any function definitions in the file, or it won't get seen.
You cannot modify bindings in a package namespace once the package is loaded (and namespace sealed, and bindings locked). The check tool helps you to spot violations of this restriction, so you find out about the problem when checking the package rather than while running it. globalVariables is just a call to silence check when looking for these violations, which is undesirable in almost all cases. If you really need mutable state in a package, you can create a new environment (using new.env) and bind it to an (unexported) "global" variable in your namespace. This binding will be locked, but this is ok, because in R you can change an environment in place (add/remove elements, effectively modifying the elements).
The best situation is however when you can keep all mutable state in user objects (passed in as arguments into functions, and their modified versions returned as output values of functions).

How can I tell which packages I am not using in my R script?

As my code evolves from version to version, I'm aware that there are some packages for which I've found better/more appropriate packages for the task at hand or whose purpose was limited to a section of code which I've now phased out.
Is there any easy way to tell which of the loaded packages are actually used in a given script? My header is beginning to get cluttered.
Update 2020-04-13
I've now updated the referenced function to use the abstract syntax tree (AST) instead of using regular expressions as before. This is a much more robust way of approaching the problem (it's still not completely ironclad). This is available from version 0.2.0 of funchir, now on CRAN.
I've just got around to writing a quick-and-dirty function to handle this which I call stale_package_check, and I've added it to my package (funchir).
e.g., if we save the following script as test.R:
library(data.table)
library(iotools)
DT = data.table(a = 1:3)
Then (from the directory with that script) run funchir::stale_package_check('test.R'), we'll get:
Functions matched from package data.table: data.table
**No exported functions matched from iotools**
Have you considered using packrat?
packrat::clean() would remove unused packages, for example.
I've written a command-line script to accomplish this task. You can find it in this Github gist. I'm sure there are edge cases that it misses, but it works pretty well, on both R scripts and Rmd files.
My approach always is to close my R script or IDE (i.e. RStudio) and then start it again.
After this I run my function without loading any dependecies/packages beforehand.
This should result in various warning and error messages telling you which functions couldn't be found and executed. This again will give you hints on what packages are necessary to load beforehand and which one you can leave out.

Adding a function in R for use without loading it

So I have code for a custom function in R, e.g. simply:
f <- function(x) return(x)
What I want to do is be able to call this function f as if it came with default R. That is, I don't want to have to source() the file containing the code to be able to use the function. Is there a way to accomplish this?
What you are looking to do is documented extensively under ?Startup. Basically, you need to customize your Rprofile.site or .Rprofile to include the functions that you want available on startup. A simple guide can be found at the Quick-R site.
One thing to note: if you are commonly sharing code with others, you do need to remember what you've changed in your startup options and be sure that the relevant functions are also shared.

Resources