How can I discriminate between functions in R - r

I have been trying to write a code capable of using different classification functions. However, the arguments are different depending on the classification function I use. I would like to have something like this :
classification_flow <- function(classification_function, ...) {
if (classification_function == randomForest) {
...
}
else if (classification_function == svm) {
...
}
}
Of course, this doesn't work since == wasn't built for functions. I've tried using str, names, attr, and looked a bit at methodsand UseMethod, but I can't find a suitable way to do so.
Can anybody help me?
Thanks,
Jess
PS : In this particular case, what I'm trying to do is to get a matrix of probabilities as the output, so something like that :
classification_flow <- function(classification_function, train, classes, ...) {
if (classification_function == randomForest) {
mat = classification_function(train, classes, type="prob")
}
else if (classification_function == svm) {
mat = classification_function(train, classes, probabilities = T)
}
return(mat)
}
If you know a more elegant solution...

You are looking for substitute:
f <- function(x, FUN) {
if (substitute(FUN) == 'max') {
print('Max invoked')
}
FUN(x)
}
> f(1:4, sum)
[1] 10
> f(1:4, max)
[1] "Max invoked"
[1] 4

Here's a version that can get a character or a function. Then you can do character comparisons to find the right case.
classification_flow <- function(classification_function, train, classes, ...) {
if (is.function(classification_function)) {
fname <- deparse(substitute(classification_function))
} else if (is.character(classification_function)) {
fname <- classification_function
classification_function < - get(classification_function)
} else {
stop("invalid classification_function")
}
if (fname == "randomForest") {
mat = classification_function(train, classes, type="prob")
}
else if (fname == "svm") {
mat = classification_function(train, classes, probabilities = T)
}
return(mat)
}

First, make the input to classification_function a character input.
Then, use the switch function to chose between your two options like so:
classification_flow <- function(classification_function, train, classes, ...) {
switch(classification_function,
randomForrest= {mat<-classification_function(train, classes, type="prob")},
svm = { mat<-classification_function(train, classes, probabilities = T) },
stop("You did not pick randomForrest or svm")
)
return(mat)
}
Edit: Added the stop line that gives an error message if neither choice is selected. After you designate all of the options (e.g. svm=) you can add a final line to be executed if there are no prior matches.

You may be able to use the formals function to determine which arguments the function is expecting, then call it accordingly. Also see do.call for a way to dynamically create a function call and call it.

Related

How do I determine the number of arguments of a user-supplied function?

I have a function myfun which among other arguments has one that is a user supplied function, say f. This function may have any number of arguments, including maybe none. Here is a simple example:
myfun = function(f, ...) { f()}
Now calls to myfun might be
myfun( f=function() rnorm(10) )
myfun( f=function(m) rnorm(10, m) )
I don't want to use the ellipse argument ... inside of f, so my question is whether there is any other way to determine inside of myfun how many arguments the function f has? If f has no arguments it is then passed to the Rcpp routine doA.cpp, but if it has one or more arguments it is passed to doB.cpp. So I need to know inside myfun which it is.
Here is a toy example which hopefully makes it clearer what I am after:
myfun = function(f) {
numarg = number.of.arguments(f)
if(numarg==0) return(doA.cpp(f))
else return(doB.cpp(f))
}
so I need a "function" number.of.arguments, that is some way to determine numarg.
Based on your specific use-case, your best bet might be to query the formal arguments of f. However, note that there are several caveats with this method, which I note below.
f = function (f) {
if (length(formals(f)) == 0L) {
doA.cpp(f)
} else {
doB.cpp(f)
}
}
The caveats are that formals does not work for primitive functions: formals(mean) works, but formals(sum) returns NULL. Furthermore, formals counts ... as a single argument. So if you want to handle ... differently you'll have to do this manually:
if ('...' %in% names(formals(f))) {
# `...` is present
} else {
# `...` is not present
}
A more robust method when the user supplies the arguments is to find the length of the ... args via ...length().
You could then pass the ... arguments to doB.cpp inside a list, for instance:
myfun = function(f, ...) {
if (...length() == 0L) {
doA.cpp(f)
} else {
doB.cpp(f, list(...))
}
}
Konrad's is just what I need! Also, I learned about the formals function, which I had not seen before. Its limitations as discussed by Konrad won't matter for my case because the functions are ones the users have to write anyway, and I won't use ... . So, thanks!
Wolfgang
This is what I may code in R.
myfun = function(f, ...) { f(...)}
myfun( f=function(m) rnorm(10, m), m )
After getting comment from #Wolfgang Rolke, the question becomes better understood. Here is my second attempt.
myfun = function(f, ...) {
argg <- c(as.list(environment()), list(...))
numarg=length(argg)
if (numarg==1) { return( f()) }
if (numarg==2) { return(f(argg[[2]])) }
}
myfun( f=function() rnorm(10) )
myfun( f=function(m=2) rnorm(10, m) )
To easily verify the result, one may do the following:
myfun( f=function() mean(rnorm(10))) # it returns something like 0.07599287
myfun( f=function(m=10) mean(rnorm(10, m))) # it returns 9.49364

Finding all variables created by assignment - Not working for pairlist

I'm currently doing Advanced-R, 18 Expressions.
Topic is about 18.5.2 Finding all variables created by assignment, but the given code doesn't work in the case of pairlist.
I followed all the given codes, but the results are not quite same with what I expect.
To begin with, in order to figure out what the type of the input, expr_type() is needed.
expr_type <- function(x) {
if(rlang::is_syntactic_literal(x)) {
"constant"
} else if (is.symbol(x)) {
"symbol"
} else if (is.call(x)) {
"call"
} else if (is.pairlist(x)) {
"pairlist"
} else {
typeof(x)
}
}
And the author, hadley, coupled this with a wrapper around the switch function.
switch_expr <- function(x, ...) {
switch(expr_type(x),
...,
stop("Don't know how to handle type ", typeof(x), call. = FALSE)
)
}
In the case of base cases, symbol and constant, is trivial because neither represents assignment.
find_assign_rec <- function(x) {
switch_expr(x,
constant = ,
symbol = character()
)
}
In the case of recursive cases, especially for pairlists, he suggested
flat_map_chr <- function(.x, .f, ...) {
purrr::flatten_chr(purrr::map(.x, .f, ...))
}
So summing up, it follows
find_assign_rec <- function(x) {
switch_expr(x,
# Base cases
constant = ,
symbol = character(),
# Recursive cases
pairlist = flat_map_chr(as.list(x), find_assign_rec),
)
}
find_assign <- function(x) find_assign_rec(enexpr(x))
Then, I expect in the case of pl <- pairlist(x = 1, y = 2), find_assign(pl) should return #> [1] "x" "y"
But the actual output is character(0)
What is wrong with this?

R: substitue and execution environments

I am currently writing a function that will take an equation as an argument. The function will expect variables to be apart of the column names of data.
mydata <- data.frame(x=c(1,2,3,4),y=c(5,6,7,8), z=c(9,10,11,12))
my_function <- function(data, equ) {
EQU.sub <- deparse(substitute(equ))
#Check if colnames are used
for(i in 1:length(colnames(data)) {
if(str_detect(string = EQU.sub, pattern = colnames(data)[i])) {
#if used, create variable with its name.
assign(x = colnames(data)[i],
value = eval(parse(text = paste("data$",
colnames(data),
sep = ""))))
} else {
warning(paste(colnames[i], "was not used in EQU"))
}
}
df$new.value <- eval(equ)
output <- function(new.equ = equ)
return(df)
}
my_function(data = mydata, equ = x+(y^2))
I know what you may be thinking, this is a big workaround for just doing
mydata$x+(mydata$y^2)
THE ISSUE
The issue is that I want to pass my input of equ into an new function.
new_function <- function(new.equ) {
string <- deparse(substitute(new.equ))
#does some stuff....
return(output) }
however, when changing from execution environment of my_function to new_function, calling deparse(substitute(equ)) returns "equ" instead of "x+(y^2)"
I know that the function substitute returns what was explicitly assigned to the variable. (equ) but I am wondering if there is a way for new_function() to be able to see into the execution environment of my_function() so I can get the desired output of "x+(y^2)"
UPDATE
After thinking about it, I could change what I pass to new.equ to the deparsed version of equ as follows...
output <- function(new.equ = EQU.sub)
new_function <- function(new.equ) {
#given that these variables are available
value <- parse(text = new.equ)
#does some stuff....
return(output) }
but my original question still stands because I'm still new to R environments. Is there a more elegant way to go through execution environments?
Using non-standard evaulation like this can be pretty messy. Rather than trying to capture expressions from promises passed to functions, it's much safer just to pass a formula. For example
mydata <- data.frame(x=c(1,2,3,4),y=c(5,6,7,8), z=c(9,10,11,12))
my_function <- function(data, equ) {
stopifnot(inherits(equ, "formula"))
eval(equ[[2]], data)
}
new_function <- function(newequ) {
my_function(mydata, newequ)
}
my_function(mydata, ~x+(y^2))
new_function(~x+(y^2))
Or give your function an extra parameter where you can pass an expression instead so you don't have to rely on a promise. This makes it much easier to write other functions that can call your function.
my_function <- function(data, equ, .equ=substitute(equ)) {
eval(.equ, data)
}
new_function <- function(newequ) {
equ <- substitute(newequ)
my_function(mydata, .equ=equ)
}
my_function(mydata, x+(y^2))
new_function(x+(y^2))
my_function(mydata, .equ=quote(x+(y^2)))

R Error Genetic Programming Implementation

So I am brand new to R. I started learning it yesterday, because there's some data that is being very resistant to automatically importing into Mathematica and Python. I'm building a few machine learning techniques to do analysis on the data that I can now import with R. This is a genetic programming implementation that when finished should do symbolic regression on some data. (I have yet to create the mutation or crossover operators, build a legit function list, etc). I get two errors when I run the script:
> Error: attempt to apply non-function
> print(bestDude)
> Error in print(bestDude) : object 'bestDude' not found
This is my code:
library("datasets")
#Allows me to map a name to each element in a numerical list.
makeStrName<-function(listOfItems)
{
for(i in 1:length(listOfItems))
{
names(listOfItems)[i]=paste("x",i,sep="")
}
return(listOfItems)
}
#Allows me to replace each random number in a vector with the corresponding
#function in a list of functions.
mapFuncList<-function(funcList,rndNumVector)
{
for(i in 1:length(funcList))
{
replace(rndNumVector, rndNumVector==i,funcList[[i]])
}
return(rndNumVector)
}
#Will generate a random function from the list of functions and a random sample.
generateOrganism<-function(inputLen,inputSeed, functions)
{
set.seed(inputSeed)
rnd<-sample(1:length(functions),inputLen,replace=T)
Org<-mapFuncList(functions,rnd)
return(Org)
}
#Will generate a series of "Organisms"
genPopulation<-function(popSize,initialSeed,initialSize,functions)
{
population<-list("null")
for(i in 2:popSize)
{
population <- c(population,generateOrganism(initialSize,initialSeed, functions))
initialSeed <- initialSeed+1
}
populationWithNames<-makeStrName(population)
return(populationWithNames)
}
#Turns the population of functions (which are actually strings in "") into
#actual functions. (i.e. changes the mode of the list from string to function).
populationFuncList<-function(Population)
{
Population[[1]]<-"x"
funCreator<-function(snippet)
txt=snippet
function(x)
{
exprs <- parse(text = txt)
eval(exprs)
}
listOfFunctions <- lapply(setNames(Population,names(Population)),function(x){funCreator(x)})
return(listOfFunctions)
}
#Applies a fitness function to the population. Puts the best organism in
#the hallOfFame.
evalPopulation<-function(populationFuncList, inputData,outputData)
{
#rmse <- sqrt( mean( (sim - obs)^2))
hallOfFame<-list(1000000000)
for(i in 1:length(populationFuncList))
{
total<-list()
for(z in 1:length(inputData))
{
total<-c(total,(abs(populationFuncList[[i]](inputData[[z]])-outputData[[z]])))
}
rmse<-sqrt(mean(total*total))
if(rmse<hallOfFame[[1]]) {hallOfFame[[1]]<-rmse}
}
return(hallOfFame)
}
#Function list, input data, output data (data to fit to)
funcs<-list("x","log(x)","sin(x)","cos(x)","tan(x)")
desiredFuncOutput<-list(1,2,3,4,5)
dataForInput<-list(1,2,3,4,5)
#Function calls
POpulation<-genPopulation(4,1,1,funcs)
POpulationFuncList<-populationFuncList(POpulation)
bestDude<-evalPopulation(POpulationFuncList,dataForInput,desiredFuncOutput)
print(bestDude)
The code is now working thanks to Hack-R's suggestions. So here's the finalized code in case someone else runs into a similar trouble.
library("datasets")
#Allows me to map a name to each element in a numerical list.
makeStrName<-function(listOfItems)
{
for(i in 1:length(listOfItems))
{
names(listOfItems)[i]=paste("x",i,sep="")
}
return(listOfItems)
}
#Allows me to replace each random number in a vector with the corresponding
#function in a list of functions.
mapFuncList<-function(funcList,rndNumVector)
{
for(i in 1:length(funcList))
{
rndNumVector[rndNumVector==i]<-funcList[i]
}
return(rndNumVector)
}
#Will generate a random function from the list of functions and a random sample.
generateOrganism<-function(inputLen,inputSeed, functions)
{
set.seed(inputSeed)
rnd<-sample(1:length(functions),inputLen,replace=T)
Org<-mapFuncList(functions,rnd)
return(Org)
}
#Will generate a series of "Organisms"
genPopulation<-function(popSize,initialSeed,initialSize,functions)
{
population<-list()
for(i in 1:popSize)
{
population <- c(population,generateOrganism(initialSize,initialSeed,functions))
initialSeed <- initialSeed+1
}
populationWithNames<-makeStrName(population)
return(populationWithNames)
}
#Turns the population of functions (which are actually strings in "") into
#actual functions. (i.e. changes the mode of the list from string to function).
funCreator<-function(snippet)
{
txt=snippet
function(x)
{
exprs <- parse(text = txt)
eval(exprs)
}
}
#Applies a fitness function to the population. Puts the best organism in
#the hallOfFame.
evalPopulation<-function(populationFuncList, inputData,outputData)
{
#rmse <- sqrt( mean( (sim - obs)^2))
hallOfFame<-list(1000000000)
for(i in 1:length(populationFuncList))
{
total<-vector(mode="numeric",length=length(inputData))
for(z in 1:length(inputData))
{
total<-c(total,(abs(populationFuncList[[i]](inputData[[z]])-outputData[[z]])))
}
rmse<-sqrt(mean(total*total))
if(rmse<hallOfFame[[1]]) {hallOfFame[[1]]<-rmse}
}
return(hallOfFame)
}
#Function list, input data, output data (data to fit to)
funcs<-list("x","log(x)","sin(x)","cos(x)","tan(x)")
desiredFuncOutput<-list(1,2,3,4,5)
dataForInput<-list(1,2,3,4,5)
#Function calls
POpulation<-genPopulation(4,1,1,funcs)
POpulationFuncList <- lapply(setNames(POpulation,names(POpulation)),function(x){funCreator(x)})
bestDude<-evalPopulation(POpulationFuncList,dataForInput,desiredFuncOutput)
print(bestDude)
In your function evalPopulation you're attempting to apply populationFuncList[[i]] as if it were a function, but when you pass in the argument POpulationFuncList to replace the variable populationFuncList it's not a function, it's a list.
I'm not sure what you were trying to do, so I'm not sure which way you want to fix this. If you meant to use a function you should change the name of the object you're referencing to the function and remove it as an argument or at least pass a function in as an argument instead of the list.
OTOH if you meant to use the list POpulationFuncList then you just shouldn't be applying it as if it were a function instead of a list.
On a side note, this probably would be more apparent if you didn't give them such similar names.
Another potential problem is that you seem have non-numeric results in one of your lists:
> populationFuncList(POpulation)
$x1
[1] "x"
$x2
[1] 2
$x3
[1] 1
$x4
[1] 1
You can't take the absolute value of the character "x", so I just wanted to make sure you're aware of this.
A third problem is that you're doing math on a non-numeric data typed object called total. You need to either change the type to numeric or index it appropriately.
Now it's impossible for me to know exactly which of an infinite number of possibilities you should choose to fix this, because I don't know the details of your use case. However, here is one possible solution which you should be able to adapt to the specifics of the use case:
library("datasets")
#Allows me to map a name to each element in a numerical list.
makeStrName<-function(listOfItems)
{
for(i in 1:length(listOfItems))
{
names(listOfItems)[i]=paste("x",i,sep="")
}
return(listOfItems)
}
#Allows me to replace each random number in a vector with the corresponding
#function in a list of functions.
mapFuncList<-function(funcList,rndNumVector)
{
for(i in 1:length(funcList))
{
replace(rndNumVector, rndNumVector==i,funcList[[i]])
}
return(rndNumVector)
}
#Will generate a random function from the list of functions and a random sample.
generateOrganism<-function(inputLen,inputSeed, functions)
{
set.seed(inputSeed)
rnd<-sample(1:length(functions),inputLen,replace=T)
Org<-mapFuncList(functions,rnd)
return(Org)
}
#Will generate a series of "Organisms"
genPopulation<-function(popSize,initialSeed,initialSize,functions)
{
population<-list("null")
for(i in 2:popSize)
{
population <- c(population,generateOrganism(initialSize,initialSeed, functions))
initialSeed <- initialSeed+1
}
populationWithNames<-makeStrName(population)
return(populationWithNames)
}
#Turns the population of functions (which are actually strings in "") into
#actual functions. (i.e. changes the mode of the list from string to function).
populationFuncList<-function(Population)
{
Population[[1]]<-"x"
funCreator<-function(snippet)
txt=snippet
function(x)
{
exprs <- parse(text = txt)
eval(exprs)
}
listOfFunctions <- lapply(setNames(Population,names(Population)),function(x){funCreator(x)})
return(listOfFunctions)
}
#Applies a fitness function to the population. Puts the best organism in
#the hallOfFame.
evalPopulation<-function(myList=myList, dataForInput,desiredFuncOutput)
{
#rmse <- sqrt( mean( (sim - obs)^2))
hallOfFame<-list(1000000000)
for(i in 1:length(populationFuncList))
{
total<-0
for(z in 1:length(dataForInput))
{
total<-c(total,(abs(myList[[i]]+(dataForInput[[z]])-desiredFuncOutput[[z]])))
}
rmse<-sqrt(mean(total*total))
if(rmse<hallOfFame[[1]]) {hallOfFame[[1]]<-rmse}
}
return(hallOfFame)
}
#Function list, input data, output data (data to fit to)
funcs<-list("x","log(x)","sin(x)","cos(x)","tan(x)")
desiredFuncOutput<-list(1,2,3,4,5)
dataForInput<-list(1,2,3,4,5)
#Function calls
POpulation<-genPopulation(4,1,1,funcs)
myList <-populationFuncList(POpulation)[2:4]
bestDude<-evalPopulation(myList,dataForInput,desiredFuncOutput)
print(bestDude)
[[1]]
[1] 1.825742

Is it possible to see source code of a value of function

I am using a function from a package. this function returns some values. For example:
k<-dtw(v1,v2, keep.internals=TRUE)
and I can get this value:
k$costMatrix
Does it possible to see the source code of costMatrix? if yes how can I do that?
UPDATE
this is the source code of the function:
function (x, y = NULL, dist.method = "Euclidean", step.pattern = symmetric2,
window.type = "none", keep.internals = FALSE, distance.only = FALSE,
open.end = FALSE, open.begin = FALSE, ...)
{
lm <- NULL
if (is.null(y)) {
if (!is.matrix(x))
stop("Single argument requires a global cost matrix")
lm <- x
}
else if (is.character(dist.method)) {
x <- as.matrix(x)
y <- as.matrix(y)
lm <- proxy::dist(x, y, method = dist.method)
}
else if (is.function(dist.method)) {
stop("Unimplemented")
}
else {
stop("dist.method should be a character method supported by proxy::dist()")
}
wfun <- .canonicalizeWindowFunction(window.type)
dir <- step.pattern
norm <- attr(dir, "norm")
if (!is.null(list(...)$partial)) {
warning("Argument `partial' is obsolete. Use `open.end' instead")
open.end <- TRUE
}
n <- nrow(lm)
m <- ncol(lm)
if (open.begin) {
if (is.na(norm) || norm != "N") {
stop("Open-begin requires step patterns with 'N' normalization (e.g. asymmetric, or R-J types (c)). See papers in citation().")
}
lm <- rbind(0, lm)
np <- n + 1
precm <- matrix(NA, nrow = np, ncol = m)
precm[1, ] <- 0
}
else {
precm <- NULL
np <- n
}
gcm <- globalCostMatrix(lm, step.matrix = dir, window.function = wfun,
seed = precm, ...)
gcm$N <- n
gcm$M <- m
gcm$call <- match.call()
gcm$openEnd <- open.end
gcm$openBegin <- open.begin
gcm$windowFunction <- wfun
lastcol <- gcm$costMatrix[np, ]
if (is.na(norm)) {
}
else if (norm == "N+M") {
lastcol <- lastcol/(n + (1:m))
}
else if (norm == "N") {
lastcol <- lastcol/n
}
else if (norm == "M") {
lastcol <- lastcol/(1:m)
}
gcm$jmin <- m
if (open.end) {
if (is.na(norm)) {
stop("Open-end alignments require normalizable step patterns")
}
gcm$jmin <- which.min(lastcol)
}
gcm$distance <- gcm$costMatrix[np, gcm$jmin]
if (is.na(gcm$distance)) {
stop("No warping path exists that is allowed by costraints")
}
if (!is.na(norm)) {
gcm$normalizedDistance <- lastcol[gcm$jmin]
}
else {
gcm$normalizedDistance <- NA
}
if (!distance.only) {
mapping <- backtrack(gcm)
gcm <- c(gcm, mapping)
}
if (open.begin) {
gcm$index1 <- gcm$index1[-1] - 1
gcm$index2 <- gcm$index2[-1]
lm <- lm[-1, ]
gcm$costMatrix <- gcm$costMatrix[-1, ]
gcm$directionMatrix <- gcm$directionMatrix[-1, ]
}
if (!keep.internals) {
gcm$costMatrix <- NULL
gcm$directionMatrix <- NULL
}
else {
gcm$localCostMatrix <- lm
if (!is.null(y)) {
gcm$query <- x
gcm$reference <- y
}
}
class(gcm) <- "dtw"
return(gcm)
}
but if I write globalCostMatrix I dont get the source code of this function
The easiest way to find how functions work is by looking at the source. You have a good chance that by typing function name in the R console, you will get the function definitions (although not always with good layout, so seeking the source where brackets are present, is a viable option).
In your case, you have a function dtw from the same name package. This function uses a function called globalCostMatrix. If you type that name into R, you will get an error that object was not found. This happens because the function was not exported when the package was created, probably because the author thinks this is not something a regular user would use (but not see!) or to prevent clashes with other packages who may use the same function name.
However, for an interested reader, there are at least two ways to access the code in this function. One is by going to CRAN, downloading the source tarballs and finding the function in the R folder of the tar ball. The other one, easier, is by using getAnywhere function. This will give you the definition of the function just like you're used for other, user accessible functions like dtw.
> library(dtw)
> getAnywhere("globalCostMatrix")
A single object matching ‘globalCostMatrix’ was found
It was found in the following places
namespace:dtw
with value
function (lm, step.matrix = symmetric1, window.function = noWindow,
native = TRUE, seed = NULL, ...)
{
if (!is.stepPattern(step.matrix))
stop("step.matrix is no stepMatrix object")
n <- nrow(lm)
... omitted for brevity
I think you want to see what the function dtw() does with your data. I seems that it creates a data.frame containing a column named costMatrix.
To find out how the data in the column costMatrix was generated, just type and execute dtw (without brackets!). R will show you the source of the function dtw() afterwards.

Resources