Can anyone help me understand what a wrapper function in R is? I would really appreciate it if you could explain it with examples of building one's own wrapper function and when to use one.
Thanks in advance.
Say I want to use mean() but I want to set some default arguments, and my use case doesn't allow me to add additional arguments when I'm actually calling mean().
I could create a wrapper function:
mean_noNA <- function(x) {
  # TRUE rather than T: T is an ordinary variable that can be reassigned
  return(mean(x, na.rm = TRUE))
}
mean_noNA is a wrapper for mean() where we have set na.rm to TRUE.
Now we can call mean_noNA(x) and get the same result as mean(x, na.rm = TRUE).
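To illustrate the "can't add arguments" situation, here is a small sketch. The helper summarise_each and the scores data are made up for this example; the helper stands in for any caller that invokes your function with the data only.

```r
mean_noNA <- function(x) {
  mean(x, na.rm = TRUE)
}

# Hypothetical caller: it always invokes f with exactly one argument,
# so there is nowhere to slip in na.rm = TRUE.
summarise_each <- function(data, f) {
  vapply(data, f, numeric(1))
}

# Made-up data with missing values
scores <- list(a = c(1, 2, NA), b = c(10, NA, 30))

plain   <- summarise_each(scores, mean)       # NA for both elements
wrapped <- summarise_each(scores, mean_noNA)  # missing values are dropped first
print(plain)
print(wrapped)
```

Passing mean directly gives NA for both elements, while the wrapper gives the means of the non-missing values.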
Wrapper functions occur in any programming language. "Wrapping" means placing one function inside another function that alters how it works in some useful way. When we refer to a "wrapper" function, we mean a function whose main purpose is to call some internal function; there may be some alteration or additional computation in the wrapper, but it is minor enough that the original function still does the bulk of the computation.
As an example, consider the following wrapper function for the log function in R. One of the drawbacks of the original function is that it does not work properly for negative numeric inputs (it gives NaN with a warning message). We can remedy this by creating a "wrapper" function that turns it into the complex logarithm:
Log <- function(x, base = exp(1)) {
LOG <- base::log(as.complex(x), base = base)
if (all(Im(LOG) == 0)) { LOG <- Re(LOG) }
LOG }
The function Log is a "wrapper" for log that adjusts it so that it accepts numeric or complex inputs, including negative numeric inputs. If it receives a non-negative numeric input or a complex input, it gives the same output as the original log function. If it is given a negative numeric input, it gives the complex output that the complex logarithm should return.
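A quick check of that behaviour, repeating the definition of Log so the snippet runs on its own (the input values are arbitrary):

```r
Log <- function(x, base = exp(1)) {
  LOG <- base::log(as.complex(x), base = base)
  if (all(Im(LOG) == 0)) { LOG <- Re(LOG) }
  LOG
}

real_result    <- Log(100, base = 10)  # positive input: plain numeric result
complex_result <- Log(-1)              # negative input: complex result
mixed_result   <- Log(c(1, -1))        # one negative element makes the whole result complex

print(real_result)
print(complex_result)
print(mixed_result)
```

Note that for vector input the decision is all-or-nothing: the result is converted back to a real vector only if every element has a zero imaginary part.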
Related
I am looking to make a function where the user can enter their own model selection function as an input to be used. I'm having trouble with finding the answer as I keep getting search results about how to make a simple R function, as opposed to an input much like the apply family.
Here is an example similar to what I am looking for but not quite:
simple<- function(mod, FUN){
switch(FUN,
AIC = AIC(mod),
BIC = BIC(mod))
}
simple(lm(rnorm(100) ~ rnorm(100,4)), "AIC")
The above code runs but I must plan for all of the possible functions and write them within switch. I also am forced to write "AIC" as opposed to simply AIC.
Any thoughts to how I can create the function I am looking for? Let me know if you need additional information.
Something like this:
simple<- function(mod, FUN){
FUN <- match.fun(FUN)
FUN(mod)
}
simple(lm(rnorm(100) ~ rnorm(100,4)),FUN = "BIC")
match.fun accepts a function, symbol or character, so there is some flexibility in how the FUN argument is passed.
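A short sketch of that flexibility. The model on the built-in mtcars data is only a stand-in for whatever model you have, and the anonymous function is an illustration, not part of the original answer:

```r
simple <- function(mod, FUN) {
  FUN <- match.fun(FUN)
  FUN(mod)
}

# Stand-in model on a built-in dataset
fit <- lm(mpg ~ wt, data = mtcars)

by_name   <- simple(fit, "AIC")  # function given as a character string
by_symbol <- simple(fit, AIC)    # function given directly, no quotes needed
# Any user-written function of the model works too; this one reproduces BIC
by_custom <- simple(fit, function(m) AIC(m, k = log(nobs(m))))

print(c(by_name, by_symbol, by_custom))
```

So the caller is no longer limited to a fixed list inside switch, and can write AIC instead of "AIC".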
An option for passing multiple functions, as mentioned in the comments:
simple <- function(mod, FUN){
FUNS <- lapply(FUN,match.fun)
lapply(FUNS,function(fun) fun(mod))
}
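For the multiple-function version, a usage sketch (again with a stand-in model on mtcars). FUN should be a character vector or a list of functions; a single bare function would need to be wrapped in list() first.

```r
simple <- function(mod, FUN) {
  FUNS <- lapply(FUN, match.fun)
  lapply(FUNS, function(fun) fun(mod))
}

# Stand-in model on a built-in dataset
fit <- lm(mpg ~ wt, data = mtcars)

res_chr  <- simple(fit, c("AIC", "BIC"))              # character vector of names
res_list <- simple(fit, list(aic = AIC, bic = BIC))   # named list of functions

str(res_chr)
str(res_list)
```

Naming the list elements is optional, but the names carry through to the result, which makes it easier to read.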
I am trying to make a function in R that calculates the mean of nitrate, sulfate and ID. My original dataframe has 4 columns (date, nitrate, sulfate, ID). So I wrote the following code:
prueba<-read.csv("C:/Users/User/Desktop/coursera/001.csv",header=T)
columnmean<-function(y, removeNA=TRUE){ #y will be a matrix
whichnumeric<-sapply(y, is.numeric)#which columns are numeric
onlynumeric<-y[ , whichnumeric] #selecting just the numeric columns
nc<-ncol(onlynumeric) #number of numeric columns
means<-numeric(nc)#empty vector for the means
for(i in 1:nc){
means[i]<-mean(onlynumeric[,i], na.rm = TRUE)
}
}
columnmean(prueba)
When I run these steps line by line on my data, outside the function, I get the mean values. But when I call the function so that it does all the steps itself, it gives no error and also shows no value; my environment contains only the dataframe 'prueba' and the columnmean function.
what am I doing wrong?
A reproducible example would be nice (although not absolutely necessary in this case).
You need a final line return(means) at the end of your function. (Some old-school R users maintain that means alone is OK - R automatically returns the value of the last expression evaluated within the function whether return() is specified or not - but I feel that using return() explicitly is better practice.)
colMeans(y[sapply(y, is.numeric)], na.rm=TRUE)
is a slightly more compact way to achieve your goal (although there's nothing wrong with being a little more verbose if it makes your code easier for you to read and understand).
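For reference, a sketch of the original function with the return added. The data frame below is made up to stand in for 001.csv, which isn't available here. Beyond the return, the sketch makes three small changes of its own: it uses the removeNA argument instead of hard-coding TRUE, adds drop = FALSE so a single numeric column still works, and names the result.

```r
columnmean <- function(y, removeNA = TRUE) {
  whichnumeric <- sapply(y, is.numeric)            # which columns are numeric
  onlynumeric <- y[, whichnumeric, drop = FALSE]   # keep a data frame even with one column
  nc <- ncol(onlynumeric)
  means <- numeric(nc)                             # empty vector for the means
  for (i in seq_len(nc)) {
    means[i] <- mean(onlynumeric[, i], na.rm = removeNA)
  }
  names(means) <- names(onlynumeric)
  return(means)                                    # the line that was missing
}

# Made-up stand-in for the CSV data
prueba <- data.frame(
  Date    = c("2003-01-01", "2003-01-02", "2003-01-03"),
  sulfate = c(1, NA, 3),
  nitrate = c(0.5, 1.5, NA),
  ID      = c(1L, 1L, 1L)
)

result <- columnmean(prueba)
print(result)
```

On this data the result matches the colMeans one-liner above.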
The result of an R function is the value of the last expression. Your last expression is:
for(i in 1:nc){
means[i]<-mean(onlynumeric[,i], na.rm = TRUE)
}
It may seem strange that the value of that expression is NULL, but that's the way it is with for-loops in R. The means vector does get changed sequentially, which means that BenBolker's advice to use return(.) is correct (as his advice almost always is). For-loops in R are a notable exception to the functional programming paradigm. They provide a mechanism for looping (as do the various *apply functions), but the commands inside the loop exert their effects in the calling environment via side effects (unlike the apply functions).
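A minimal demonstration of that point, using two throwaway functions invented for this example:

```r
# The last expression is the for-loop, so the function returns NULL
no_return <- function() {
  total <- 0
  for (i in 1:3) total <- total + i
}

# Same loop, but the last expression is the variable the loop modified
with_return <- function() {
  total <- 0
  for (i in 1:3) total <- total + i
  total
}

print(is.null(no_return()))
print(with_return())
```

The loop does its work in both functions; only the second one hands the result back to the caller.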
I have this function
ANN<-function (x,y){
DV<-rep(c(0:1),5)
X1<-c(1:10)
X2<-c(2:11)
ANN<-neuralnet(x~y,hidden=10,algorithm='rprop+')
return(ANN)
}
I need the function run like
formula=X1+X2
ANN(DV,formula)
and get the result of the function. So the problem is to make the function use the objects that are created while the function runs. I need to run more combinations of x and y through lapply, so I need it this way. Any advice on how to achieve this? Thanks
I've edited my answer, this still works for me. Does it work for you? Can you be specific about what sort of errors you are getting?
New response:
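library(neuralnet)
ANN <- function(y) {  # y: one-sided formula with the predictors, e.g. ~ X1 + X2
  X1 <- c(1:10)
  DV <- rep(c(0:1), 5)
  X2 <- c(2:11)
  dat <- data.frame(DV, X1, X2)  # DV must be in the data as well
  f <- update(y, DV ~ .)         # builds DV ~ X1 + X2
  ANN <- neuralnet(f, hidden = 10, algorithm = 'rprop+', data = dat)
  return(ANN)
}
formula <- ~ X1 + X2  # a formula is not evaluated, so X1 and X2 need not exist here
ANN(formula)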
If you want to specify the two parts of the formula separately, you should still pass them as formulas.
library(neuralnet)
ANN<-function (x,y){
DV<-rep(c(0:1),5)
X1<-c(1:10)
X2<-c(2:11)
formula<-update(x,y)
ANN<-neuralnet(formula,data=data.frame(DV,X1,X2),
hidden=10,algorithm='rprop+')
return(ANN)
}
ANN(DV~., ~X1+X2)
And assuming you're using neuralnet() from the neuralnet package, it seems the data= argument is required, so you'll need to pass in a data.frame with those columns.
Formulas are special because they are not evaluated unless explicitly requested. This is different from using a bare symbol, which is evaluated in the current frame as soon as you use it. This means there's a big difference between DV (a "name") and DV~. (a formula). The latter is safer for passing around to functions and evaluating in a different context. Things get much trickier with symbols/names.
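A small base-R sketch of that difference, independent of neuralnet. The variable names mirror the question; the data frame is built only for this illustration, and the snippet assumes no object called DV already exists in your session.

```r
# Creating a formula does not look up DV, X1 or X2, so this works
# even though none of them exist yet.
f <- DV ~ X1 + X2
print(class(f))
print(all.vars(f))

# A bare symbol is evaluated immediately and fails if the object is missing.
sym_result <- tryCatch(DV, error = function(e) "error")
print(sym_result)

# The formula is only resolved when something evaluates it against data.
dat <- data.frame(DV = rep(0:1, 5), X1 = 1:10, X2 = 2:11)
mf <- model.frame(f, data = dat)
print(dim(mf))
```

This is why passing DV ~ . and ~ X1 + X2 into the function works, while passing DV or X1 + X2 fails before the function body even gets to use them.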
I am searching for a fast and simple way to give comprehensible names to the columns of a VAR-coefficient matrix.
What I would like to use is the function VAR.names, which is used in the function VAR.est() in the VAR.etp-package. When I use the function VAR.est(), this works perfectly, but as soon as I modify VAR.est (by adding another element to the list of values which are returned), I receive an error message stating "could not find function VAR.names".
I could not find any information on the function VAR.names.
Example:
library(VAR.etp)
data(dat)
M=VAR.est(dat,p=2,type="const")
M$coef
Another possibility would be to use a loop as in the function VAR() from the vars package, but if VAR.names would actually work, this would be a lot more elegant!
I'm having trouble applying a custom function in R. The basic setup is I have a bunch of points, I want to drop a grid over top and get the max z value from each cell.
The code I'm trying is below. The result I'm looking for would be myGrid$z = c(5, 10, NA). The NA could be a different value as well, as long as I can filter it out later. I'm getting an error at the apply stage.
I believe there is an error in how I'm using apply, but I just haven't been able to get my head wrapped around apply.
thanks,
Gordon
myPoints<-data.frame(x=c(0.7,0.9,2),y=c(0.5,0.7,3), z=c(5,3,10))
myGrid<-data.frame(x=c(0.5,2,4),y=c(0.5,3,10))
grid_spacing = 1
get_max_z<-function(x,y) {
z<-max(myPoints$z[myPoints$x > (x-grid_spacing/2)
& myPoints$x <= (x+grid_spacing/2)
& myPoints$y > (y-grid_spacing/2)
& myPoints$y <= (y+grid_spacing/2)])
return(z)
}
myGrid$z<-apply(myGrid,1,get_max_z(x,y),x=myGrid$x,y=myGrid$y)
Edited to include the return(z) line I left out. added $y to line of custom function above return.
First of all I would recommend you to always boil down a question to its core instead of just posting code. But I think I know what your problem is:
> df <- data.frame(x = c(1,2,3), y = c(2,1,5))
> f <- function(x,y) {x+y}
> apply(df,1,function(d)f(d["x"],d["y"]))
[1] 3 3 8
apply(df,1,.) will traverse df row wise and hand the current row as an argument to the provided function. This row is a vector and passed into the anonymous function via the only available argument d. Now you can access the elements of the vector and hand them further down to your custom function f taking two parameters.
I think if you get this small piece of code then you know how to adjust in your case.
UPDATE:
Essentially you make two mistakes:
1. You hand a function call instead of a function to apply.
function call: get_max_z(x,y)
function: function(x,y)get_max_z(x,y)
2. You misinterpreted the "..." in the manual page for apply as the way to hand over the per-row arguments. It is actually just a way to pass additional arguments that are independent of the traversed data object.
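Applied to the grid example from the question, a sketch of the fix could look like this. The empty-cell check that returns NA is an addition made for this sketch (without it, max() on an empty selection gives -Inf with a warning), and the mapply line is an alternative not mentioned in the answer.

```r
myPoints <- data.frame(x = c(0.7, 0.9, 2), y = c(0.5, 0.7, 3), z = c(5, 3, 10))
myGrid <- data.frame(x = c(0.5, 2, 4), y = c(0.5, 3, 10))
grid_spacing <- 1

get_max_z <- function(x, y) {
  inside <- myPoints$x > (x - grid_spacing / 2) &
    myPoints$x <= (x + grid_spacing / 2) &
    myPoints$y > (y - grid_spacing / 2) &
    myPoints$y <= (y + grid_spacing / 2)
  # Assumption for this sketch: an empty cell yields NA
  if (!any(inside)) return(NA_real_)
  max(myPoints$z[inside])
}

# Pass a function (not a call); each row arrives as a named vector
myGrid$z <- apply(myGrid, 1, function(row) get_max_z(row["x"], row["y"]))

# Alternative: mapply walks the two columns in parallel
z_mapply <- mapply(get_max_z, myGrid$x, myGrid$y)

print(myGrid)
```

Both approaches give 5 and 10 for the first two cells and NA for the third, which contains no points.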