How to avoid nested function definitions and still use <<- operator - r

Suppose I have these two functions:
fInner<-function()
{
# Do some complicated stuff, and...
I<<-I+1
}
fOuter<-function()
{
I<-0
fInner()
print(I)
}
Calling fOuter() produces an error: Error in fInner() : object 'I' not found. I know, that one way of fixing it is to re-write the example into
fOuter<-function()
{
fInner<-function()
{
I<<-I+1
}
I<-0
fInner()
print(I)
}
But what if I don't want to nest the functions that way? In my case fInner is a very heavy and complicated function (which I might want to move into my library), and fOuter is really an ad-hoc created function that defines one iterator. I will have many fOuter-class functions in my code, so nesting them will require duplicating the fInner inside every single little one iterator definition.
I suspect the solution has something to do with environments, but I don't know how to do it. Can anyone help?

You need access to the environment associated with the fOuter function--where the call to fInner is. You can get this with parent.frame, and you can get and set variables with get and assign:
fInner<-function()
{
assign("I", get("I", envir=parent.frame()) + 1, envir=parent.frame())
}
fOuter<-function()
{
I<-0
fInner()
print(I)
}
See ?environment and ?parent.frame for more info.
But this is just to satisfy your curiosity! You seem to agree that this is not a good idea. Manipulating environments and names can quickly get complicated, and unless you're doing something truly complex, there's probably a better way.

Related

R: IF object is TRUE then assign object NOT WORKING

I am trying to write a very basic IF statement in R and am stuck. I thought I'd find someone with the same problem, but I cant. Im sorry if this has been solved before.
I want to check if a variable/object has been assigned, IF TRUE I want to execute a function that is part of a R-package. First I wrote
FileAssignment <- function(x){
if(exists("x")==TRUE){
print("yes!")
x <- parse.vdjtools(x)
} else { print("Nope!")}
}
I assign a filename as x
FILENAME <- "FILENAME.txt"
I run the function
FileAssignment(FILENAME)
I use print("yes!") and print("Nope!") to check if the IF-Statement works, and it does. However, the parse.vdjtools(x) part is not assigned. Now I tested the same IF-statement outside of the function:
if(exists("FILENAME1")==TRUE){
FILENAME1 <- parse.vdjtools(FILENAME1)
}
This works. I read here that it might be because the function uses {} and the if-statement does too. So I should remove the brackets from the if-statement.
FileAssignment <- function(x){
if(exists("x")==TRUE)
x <- parse.vdjtools(x)
else { print("Nope!")
}
Did not work either.
I thought it might be related to the specific parse.vdjtools(x) function, so I just tried assigning a normal value to x with x <- 20. Also did not work inside the function, however, it does outside.
I dont really know what you are trying to acheive, but I wpuld say that the use of exists in this context is wrong. There is no way that the x cannot exist inside the function. See this example
# All this does is report if x exists
f <- function(x){
if(exists("x"))
cat("Found x!", fill = TRUE)
}
f()
f("a")
f(iris)
# All will be found!
Investigate file.exists instead? This is vectorised, so a vector of files can be investigated at the same time.
The question that you are asking is less trivial than you seem to believe. There are two points that should be addressed to obtain the desired behavior, and especially the first one is somewhat tricky:
As pointed out by #NJBurgo and #KonradRudolph the variable x will always exist within the function since it is an argument of the function. In your case the function exists() should therefore not check whether the variable x is defined. Instead, it should be used to verify whether a variable with a name corresponding to the character string stored in x exists.
This is achieved by using a combination of deparse() and
substitute():
if (exists(deparse(substitute(x)))) { …
Since x is defined only within the scope of the function, the superassignment operator <<- would be required to make a value assigned to x visible outside the function, as suggested by #thothai. However, functions should not have such side effects. Problems with this kind of programming include possible conflicts with another variable named x that could be defined in a different context outside the function body, as well as a lack of clarity concerning the operations performed by the function.
A better way is to return the value instead of assigning it to a variable.
Combining these two aspects, the function could be rewritten like this:
FileAssignment <- function(x){
if (exists(deparse(substitute(x)))) {
print("yes!")
return(parse.vdjtools(x))
} else {
print("Nope!")
return(NULL)}
}
In this version of the function, the scope of x is limited to the function body and the function has no side effects. The return value of FileAssignment(a) is either parse.vdjtools(a) or NULL, depending on whether a exists or not.
Outside the function, this value can be assigned to x with
x <- FileAssignment(a)

R - How to handle the dot-dot-dot (ellipis/"...") with multiple subsequent functions - i.e. passing only some of the variables

I'm working on a non-linear optimization, with the constrOptim.nl from the alabama package. However, my problem is more related to passing arguments (and the dot-dot-dot (ellipis/"...") and maybe do.call)- so I give first a general example and later refer to the constrOptim.nl function.
Suppose, I have the following functions - from which I only can edit the second and third but not the first.
first<-function (abc, second, third, ...){
second(abc,...)
third(abc,...)
}
second<- function(abc, ttt='nothing special'){
print(abc)
print(ttt)
}
third<- function(abc, zzz="default"){
print(abc)
print(zzz)
}
The output I want is the same I would get when I just run
second("test", ttt='something special')
third("test", zzz="non-default")
This is
"test"
"something special"
"test"
"non-default"
However, the code below doesn't work to do this.
first("test",second=second, third=third, ttt='something special',zzz="non-default")
How can I change the call or the second and third function to make it work?
http://www.r-bloggers.com/r-three-dots-ellipsis/
here I found some advice that do.call could help me but at the moment I'm not capable of understanding how it should work.
I cannot change the first function since this is the constrOptim.nl in my particular problem - and it is designed to be capable of passing more arguments to different functions. However, I can change the second and third function - as they are the restrictions and the function that I want to minimize. Obviously I can also change the call of the function.
So to be more particular, here is my specific problem:
I perform a maximum likelihood estimation with non-linear restrictions:
minimize <- function(Param,VARresiduals){
#Blahblah
for (index in 1:nrow(VARreisduals)){
#Likelihood Blahbla
}
return(LogL)
}
heq<-function(Param,W){
B<-Param[1:16]
restriction[1]<-Lrestriction%*%(diag(4)%x%(solve(W))%*%as.vector(B))
restriction[2:6]<-#BlablaMoreRestrictions
return(restriction)
}
Now I call the constrOptim.nl...
constrOptim.nl(par=rnorm(20), fn=minimize,hin=NULL heq=heq,VARresiduals,W)
...but get the same error, as I receive when I call the first function above - something like: "Error in second(abc, ...) : unused argument (zzz = "non-default")".
How can I change minimize and heq or the call? :) Thanks in Advance
Update after the post got marked as a duplicate:
The answer to the related post changes the first function in my example - as it implements a do.call there, that calls the other functions. However, I cannot change the first function in my example as I want to keep the constrOptim.nl working a variety of different functions. Is there another way?
The solution I came up with is not very elegant but it works.
second_2<- function(abc, extras){
a<-extras[[1]]
print(abc)
print(a)
}
third_2<- function(abc, extras){
a<-extras[[2]]
print(abc)
print(a)
}
extras<-list()
extras[[1]]<-'something special'
extras[[2]]<-"non-default"
extras
first("test",second=second_2, third=third_2, extras)
surprisingly also the following code works, but with a slightly different outcome
first("test",second=second, third=third, extras)
after all, setting default values is now a little clumsy but not infeasible.

Write R function with control flow that depends on argument value

For small function, it is trivial to just write conditional statement based on the argument value. For example, I have a function that extracts variable label from an ex-STATA dataframe. There are two options for output-type, environment and df.
f_extract_stata_label <- function(df, output="environment") {
if (output=="env") {
lab_env <- new.env()
for (i in seq_along(names(df))) {
lab_env[[names(df)[i]]] <- attr(df, "var.labels")[i]
}
return(lab_env)
} else if (output=="df") {
lab_df <- data.frame(var.name = names(d_tmp),
var.label = attr(d_tmp, "var.labels"))
return(lab_df)
}
}
However, I suspect that this is not good R idiom. First, how the function depends on output is not clear -- the reader has to read half way through the code to find out. Second, adding options to output in the future makes the function very hard to read.
So how should I rewrite this function?
R uses this kind of pattern in its core stats libraries where "label" strings make sense. These are functions where R's dispatch system is not that useful. That said, what you want is still dispatch-like.
You could refactor it to use a switch that calls a function dedicated to a specific output type. Two things happen then. First, the extra function call makes it clear what context you're in when using the traceback. Second, it makes the functions smaller and easier to read.
I would question whether you really want to use a dispatch function though, and why separate direct functions are not appropriate.

How to change the loop structure into sapply function?

I had changed the for loop into sapply function ,but failed .
I want to know why ?
list.files("c:/",pattern="mp3$",recursive=TRUE,full.names=TRUE)->z
c()->w
left<-basename(z)
for (i in 1:length(z)){
if (is.element(basename(z[i]),left))
{
append(w,values=z[i])->w;
setdiff(left,basename(z[i]))->left
}}
print(w)
list.files("c:/",pattern="mp3$",recursive=TRUE,full.names=TRUE)->z
c()->w
left<-basename(z)
sapply(z,function(y){
if (is.element(basename(y),left))
{ append(w,values=y)->w;
setdiff(left,basename(y))->left
}})
print(w)
my rule of selecting music is that if the basename(music) is the same ,then save only one full.name of music ,so unique can not be used directly.there are two concepts full.name and basename in file path which can confuse people here.
The problem you have here is that you want your function to have two side-effects. By side-effect, I mean modify objects that are outside its scope:w and left.
Curently, w and left are only modified within the function's scope, then they are eventually lost as the function call ends.
Instead, you want to modify w and left outside the function's environment. For that you can use <<- instead of <-:
sapply(z, function(y) {
if (is.element(basename(y),left)) {
w <<- append(w, values = y)
left <<- setdiff(left, basename(y))
}
})
Note that I have been saying "you want", "you can", but this is not what "you should" do. Functions with side-effects are really considered bad programming. Try reading about it.
Also, it is good to reserve the *apply tools to functions that can run their inputs independently. Here instead, you have an algorithm where the outcome of an iteration depends on the outcome of the previous ones. These are cases where you're better off using a for loop, unless you can rethink the algorithm in a framework that better suits *apply or can make use of functions that can handle such dependent situations: filter, unique, rle, etc.
For example, using unique, your code can be rewritten as:
base.names <- basename(z)
left <- unique(base.names)
w <- z[match(left, base.names)]
It also has the advantage that it is not building an object recursively, another no-no of your current code.

find all functions in a package that use a function

I would like to find all functions in a package that use a function. By functionB "using" functionA I mean that there exists a set of parameters such that functionA is called when functionB is given those parameters.
Also, it would be nice to be able to control the level at which the results are reported. For example, if I have the following:
outer_fn <- function(a,b,c) {
inner_fn <- function(a,b) {
my_arg <- function(a) {
a^2
}
my_arg(a)
}
inner_fn(a,b)
}
I might or might not care to have inner_fn reported. Probably in most cases not, but I think this might be difficult to do.
Can someone give me some direction on this?
Thanks
A small step to find uses of functions is to find where the function name is used. Here's a small example of how to do that:
findRefs <- function(pkg, fn) {
ns <- getNamespace(pkg)
found <- vapply(ls(ns, all.names=TRUE), function(n) {
f <- get(n, ns)
is.function(f) && fn %in% all.names(body(f))
}, logical(1))
names(found[found])
}
findRefs('stats', 'lm.fit')
#[1] "add1.lm" "aov" "drop1.lm" "lm" "promax"
...To go further you'd need to analyze the body to ensure it is a function call or the FUN argument to an apply-like function or the f argument to Map etc... - so in the general case, it is nearly impossible to find all legal references...
Then you should really also check that getting the name from that function's environment returns the same function you are looking for (it might use a different function with the same name)... This would actually handle your "inner function" case.
(Upgraded from a comment.) There is a very nice foodweb function in Mark Bravington's mvbutils package with a lot of this capability, including graphical representations of the resulting call graphs. This blog post gives a brief description.

Resources