I have the following data frame:
> coc_comp_model[1:3,]
Relationship Output Input |r-Value| Y-Intercept Gradient
1 DG-r ~ DG-cl DG-r DG-cl 0.8271167 0.0027217513 12.9901380
2 CA3-r ~ CA3-cl CA3-r CA3-cl 0.7461309 0.0350767684 27.6107963
3 CA2-r ~ CA2-cl CA2-r CA2-cl 0.9732584 -0.0040992226 35.8299582
I want to create simple functions for each row of the data frame. here's what I've tried:
for(i in 1:nrow(coc_comp_model)) {
coc_glm_f[i] <- function(x)
x*coc_comp_model$Gradient[i] + coc_comp_model$Y-Intercept[i]
}
also tried making a vector of functions, which also does ont work either.
Thanks for reading this/helping.
Something like this:
myfunc<-function(datrow, x){
x*as.numeric(datrow[6]) + as.numeric(datrow[5] )
}
Then you can use apply to call it on each row, changing x as desired:
apply(hzdata, 1, myfunc, x = 0.5)
note: using dput() to share your data is much easier than a pasting in a subset.
There is no such thing as a vector of functions. There are 6 atomic vector types in R: raw, logical, integer, double, complex, and character, plus there is the heterogeneous list type, and finally there is the lesser known expression type, which is basically a vector of parse trees (such as you get from a call to the substitute() function). Those are all the vector types in R.
printAndType <- function(x) { print(x); typeof(x); };
printAndType(as.raw(1:3));
## [1] 01 02 03
## [1] "raw"
printAndType(c(T,F));
## [1] TRUE FALSE
## [1] "logical"
printAndType(1:3);
## [1] 1 2 3
## [1] "integer"
printAndType(as.double(1:3));
## [1] 1 2 3
## [1] "double"
printAndType(c(1i,2i,3i));
## [1] 0+1i 0+2i 0+3i
## [1] "complex"
printAndType(letters[1:3]);
## [1] "a" "b" "c"
## [1] "character"
printAndType(list(c(T,F),1:3,letters[1:3]));
## [[1]]
## [1] TRUE FALSE
##
## [[2]]
## [1] 1 2 3
##
## [[3]]
## [1] "a" "b" "c"
##
## [1] "list"
printAndType(expression(a+1,sum(1,2+3*4),if (T) 1 else 2));
## expression(a + 1, sum(1, 2 + 3 * 4), if (T) 1 else 2)
## [1] "expression"
If you want to store multiple functions in a single object, you have to use a list, and you must use the double-bracket indexing operator in the lvalue to assign to it:
fl <- list();
for (i in 1:3) fl[[i]] <- (function(i) { force(i); function(a) a+i; })(i);
fl;
## [[1]]
## function (a)
## a + i
## <environment: 0x600da11a0>
##
## [[2]]
## function (a)
## a + i
## <environment: 0x600da1ab0>
##
## [[3]]
## function (a)
## a + i
## <environment: 0x600da23f8>
sapply(fl,function(f) environment(f)$i);
## [1] 1 2 3
sapply(fl,function(f) f(3));
## [1] 4 5 6
In the above code I also demonstrate the proper way to closure around a loop variable. This requires creating a temporary function evaluation environment to hold a copy of i, and the returned function will then closure around that evaluation environment so that it can access the iteration-specific i. This holds true for other languages that support dynamic functions and closures, such as JavaScript. In R there is an additional requirement of forcing the promise to be resolved via force(), otherwise, for each generated function independently, the promise wouldn't be resolved until the first evaluation of that particular generated function, which would at that time lock in the current value of the promise target (the global i variable in this case) for that particular generated function. It should also be mentioned that this is an extremely wasteful design, to generate a temporary function for every iteration and evaluate it, which generates a new evaluation environment with a copy of the loop variable.
If you wanted to use this design then your code would become:
coc_glm_f <- list();
for (i in 1:nrow(coc_comp_model)) {
coc_glm_f[[i]] <- (function(i) { force(i); function(x) x*coc_comp_model$Gradient[i] + coc_comp_model$`Y-Intercept`[i]; })(i);
};
However, it probably doesn't make sense to create a separate function for every row of the data.frame. If you intended the x parameter to take a scalar value (by which I mean a one-element vector), then you can define the function as follows:
coc_glm_f <- function(x) x*coc_comp_model$Gradient + coc_comp_model$`Y-Intercept`;
This function is vectorized, meaning you can pass a vector for x, where each element of x would correspond to a row of coc_comp_model. For example:
coc_comp_model <- data.frame(Relationship=c('DG-r ~ DG-cl','CA3-r ~ CA3-cl','CA2-r ~ CA2-cl'),Output=c('DG-r','CA3-r','CA2-r'),Input=c('DG-cl','CA3-cl','CA2-cl'),`|r-Value|`=c(0.8271167,0.7461309,0.9732584),`Y-Intercept`=c(0.0027217513,0.0350767684,-0.0040992226),Gradient=c(12.9901380,27.6107963,35.8299582),check.names=F);
coc_glm_f(seq_len(nrow(coc_comp_model)));
## [1] 12.99286 55.25667 107.48578
Related
I'm having problems with applying a function to a vector of arguments. The point is, none of the arguments are vectors.
I'm trying to apply my function with the do.call command, and my attempts go like this:
do.call("bezmulti", list(dat$t, as.list(getvarnames(n, "a"))))
where bezmulti is a function that takes in a vector (dat$t) and an indefinite number of single numbers, which are provided by the function getvarnames in the form of a vector, which I need to split.
The problem is that this list doesn't work the way I want it to - the way I would want would be:
[[1]]
#vector goes here
[[2]]
#the
[[3]]
#numbers
[[4]]
#go
[[5]]
#here
however my proposed solution, and all my other solutions provide lists that are either somehow nested or have only two elements, both of which are vectors. Is there a way to force the list to be in the format above?
EDIT: Functions used in this post look as follows
bezmulti <- function(t,...) {
coeff <- list(...)
n <- length(coeff)-1
sumco <- rep(0, length(t))
for (i in c(0:n)) {
sumco=sumco+coeff[[i+1]]*choose(n, i)*(1-t)^(n-i)*t^i
}
return(sumco)
}
getvarnames <- function(n, charasd) {
vec=NULL
for (j in c(1:n)) {
vec <- append(vec, eval(str2expression(paste0(charasd, as.character(j)))))
}
return(vec)
}
I think what you need to do is this:
do.call("bezmulti", c(list(dat$t), as.list(getvarnames(n, "a"))))
For example:
dat= data.frame(t = c(1,2,3,4,6))
c(list(dat$t), as.list(c(8,10,12)))
Output:
[[1]]
[1] 1 2 3 4 6
[[2]]
[1] 8
[[3]]
[1] 10
[[4]]
[1] 12
Because this would simplify my code only slightly, I'm mainly asking out of curiosity. Say you have a function
f <- function(x) {
c(x[1] - x[2],
x[2] - x[3],
x[3] - x[1])
}
Is there a way of finding out the dimension of the input required, e.g.
dim(f) = 3
Here's a very questionable solution that involves computing on the language.
I've written three functions that allow matching, searching, and extracting pieces of parse trees based on a sort of "template" parse tree piece. Here they are:
ptlike <- function(lhs,rhs,wcs='.any.',wcf=NULL) if (is.symbol(rhs) && as.character(rhs)%in%wcs) !is.function(wcf) || isTRUE(wcf(lhs,as.character(rhs))) else typeof(lhs) == typeof(rhs) && length(lhs) == length(rhs) && if (is.call(lhs)) all(sapply(seq_along(lhs),function(i) ptlike(lhs[[i]],rhs[[i]],wcs,wcf))) else lhs == rhs;
ptfind <- function(ptspace,ptpat,wcs='.any.',wcf=NULL,locf=NULL,loc=integer(),ptspaceorig=ptspace) c(list(),if (ptlike(ptspace,ptpat,wcs,wcf) && (!is.function(locf) || isTRUE(locf(ptspaceorig,loc)))) list(loc),if (is.call(ptspace)) do.call(c,lapply(seq_along(ptspace),function(i) ptfind(ptspace[[i]],ptpat,wcs,wcf,locf,c(loc,i),ptspaceorig))));
ptextract <- function(ptspace,ptpat,gets='.get.',wcs='.any.',wcf=NULL,locf=NULL,getf=NULL) { getlocs <- do.call(c,lapply(gets,function(get) ptfind(ptpat,as.symbol(get),character()))); if (length(getlocs)==0L) getlocs <- list(integer()); c(list(),do.call(c,lapply(ptfind(ptspace,ptpat,unique(c(gets,wcs)),wcf,locf),function(loc) do.call(c,lapply(getlocs,function(getloc) { cloc <- c(loc,getloc); ptget <- if (length(cloc)>0) ptspace[[cloc]] else ptspace; if (!is.function(getf) || isTRUE(getf(if (missing(ptget)) substitute() else ptget,loc,getloc,as.character(ptpat[[getloc]])))) list(if (missing(ptget)) substitute() else ptget); }))))); };
ptlike() matches two parse tree pieces against each other, allowing for wildcards on the RHS to match anything. For example:
ptlike(1L,2L);
## [1] FALSE
ptlike(1L,1L);
## [1] TRUE
ptlike(quote(a),quote(b));
## [1] FALSE
ptlike(quote(a),quote(a));
## [1] TRUE
ptlike(quote(sum(a+1)),quote(sum(b+1)));
## [1] FALSE
ptlike(quote(sum(a+1)),quote(sum(.any.+1)));
## [1] TRUE
ptlike(quote(sum(a+1)),quote(.any.(a+1)));
## [1] TRUE
ptlike(quote(sum(a+1)),quote(.any.));
## [1] TRUE
ptfind() returns a recursive index vector for each match of a parse tree pattern (RHS, if you like) in a given parse tree space (LHS), combined into a list. For example:
sp <- quote({ a+b*c:d; e+f*g:h; });
ptfind(sp,quote(.any.:.any.));
## [[1]]
## [1] 2 3 3
##
## [[2]]
## [1] 3 3 3
##
sp[[c(2L,3L,3L)]];
## c:d
sp[[c(3L,3L,3L)]];
## g:h
ptextract() is similar to ptfind(), but returns the matched piece of the parse tree space or a subset thereof:
ptextract(sp,quote(c:d));
## [[1]]
## c:d
##
ptextract(sp,quote(.any.:.any.));
## [[1]]
## c:d
##
## [[2]]
## g:h
##
ptextract(sp,quote(.any.:.get.));
## [[1]]
## d
##
## [[2]]
## h
##
So what we can do is extract from (the parse tree that comprises the body of) your function all the subscripts of the x argument and get the maximum literal subscript value:
f <- function(x) c(x[1L]-x[2L],x[2L]-x[3L],x[3L]-x[1L]);
max(unlist(Filter(is.numeric,ptextract(body(f),quote(x[.get.])))));
## [1] 3
The Filter(is.numeric,...) piece isn't really necessary here, since all subscripts are literal numbers (doubles in your definition, integers in mine, although that's inconsequential), but if there were ever non-numeric-literal subscripts, then it would be necessary.
Caveats:
It's absurd.
It would not take into account variable subscripts that might rise above the maximum literal subscript value in the parse tree. (Although, strictly speaking, in the general case, it is impossible to statically analyze all possible subscripts that might occur at run-time, due to the halting problem or something.)
If the vector was ever assigned to a different variable name, say y, and then that variable was indexed with a different subscript, even a numeric literal subscript, this algorithm would not take that into account.
And there's probably some more.
Within a function, how can we reliably return an object that contains the function itself?
For example with:
functionBuilder <- function(wordToSay) {
function(otherWordToSay) {
print(wordToSay)
print(otherWordToSay)
get(as.character(match.call()[[1]]))
}
}
I can build a function like so:
functionToRun <- functionBuilder("hello nested world")
... and run it ...
functionToRun("A")
#[1] "hello nested world"
#[1] "A"
#
#function(otherWordToSay) {
# print(wordToSay)
# print(otherWordToSay)
# get(as.character(match.call()[[1]]))
# }
#<environment: 0x1e313678>
... as you can see functionToRun returns itself. However, this approach appears to break if I call functionToRun via sapply:
> sapply(LETTERS, functionToRun)
#[1] "hello nested world"
#[1] "A"
#Error in get(as.character(match.call()[[1]])) : object 'FUN' not found
I can see that this is because the actual call when using sapply is FUN but that FUN doesn't exist at pos = -1 (the default for get). Code that works in that position looks like:
get(as.character(match.call()[[1]]),envir = sys.frame(sys.parent()))
But that same code fails if the function hasn't been called via sapply because sys.frame(sys.parent())) goes too far back and ends up referring to R_GlobalEnv.
From the documentation (R 3.2.2) I'd have expected dynGet to perhaps solve the issue of only going as far back in the stack as needed. Although this works for an sapply call of the function, it fails when the function is called on its own. (Besides, it is marked as 'somewhat experimental'). Inversely getAnywhere seems promising, but doesn't seem to work for the sapply called function.
Is there a reliable way to return the function that is currently being processed, i.e. works for both a bare and sapply wrapped function call?
What I'm doing right now is wrapping the attempt to grab the function in a tryCatch; but I'm a little uncertain whether I can trust that get(as.character(match.call()[[1]]),envir = sys.frame(sys.parent())) will work in all wrapping cases (not just sapply). So, I'm looking for a more reasonable way to approach this problem.
Potentially Related Questions:
How to access a variable stored in a function in R
How to get the name of the calling function inside the called routine?
I can't guarantee that this will work in all cases, but it looks okay:
fun <- function(x) {
print(x)
y <- exp(x)
print(y)
sys.function(0)
}
fun(1)
# [1] 1
# [1] 2.718282
# function(x) {
# print(x)
# y <- exp(x)
# print(y)
# sys.function(0)
# }
lapply(1:5, fun)[[3]]
# [1] 1
# [1] 2.718282
# [1] 2
# [1] 7.389056
# [1] 3
# [1] 20.08554
# [1] 4
# [1] 54.59815
# [1] 5
# [1] 148.4132
# function(x) {
# print(x)
# y <- exp(x)
# print(y)
# sys.function(0)
# }
Of course, I don't understand what you need this for.
I am new to R and have a question on the function posted here: R RStudio Resetting debug / function environment. Why are the objects set to themselves (e.g. "getmean = getmean" etc.)? Couldn't it simply be written as follows: list(set, get, setmean, getmean)
The difference is that
aa <- list(set, get, setmean, getmean)
is an unnamed list and
bb <- list(set=set, get=get, setmean=setmean, getmean=getmean)
is a named list. Compare names(aa) and names(bb).
And that = is not assignment. It's really just giving a label to a list item. It's one of the reasons R programmers try to only use <- for assignment and leave = with this special meaning. You could have easily also done
cc <- list(apple=set, banana=get, ornage=setmean, grape=getmean)
cc$apple()
It doesn't have to be the exact same name.
Because list(set, get, setmean, getmean) won't tag the list elements with the correct names. Here's an example of the difference between tagged and untagged lists:
> list(1, 2, 3)
[[1]]
[1] 1
[[2]]
[1] 2
[[3]]
[1] 3
> list(foo=1, bar=2, baz=3)
$foo
[1] 1
$bar
[1] 2
$baz
[1] 3
Note that in the context of argument lists, = is used to supply named arguments, it does not do any assignments (unlike <-). Thus list(foo=1, bar=2, baz=3) is very different from list(foo<-1, bar<-2, baz<-3).
The question has been answered, but you could also do this to achieve the same result.
> object <- c('set', 'get', 'setmean', 'getmean')
> setNames(object = as.list(object), nm = object)
# $set
# [1] "set"
#
# $get
# [1] "get"
#
# $setmean
# [1] "setmean"
#
# $getmean
# [1] "getmean"
The quotations are dependent on what these values actually are.
And you can set different names with like this
> setNames(as.list(object), letters[1:4])
# $a
# [1] "set"
#
# $b
# [1] "get"
#
# $c
# [1] "setmean"
#
# $d
# [1] "getmean"
setNames comes in handy when working with lapply.
By playing around with a function in R, I found out there are more aspects to it than meets the eye.
Consider ths simple function assignment, typed directly in the console:
f <- function(x)x^2
The usual "attributes" of f, in a broad sense, are (i) the list of formal arguments, (ii) the body expression and (iii) the environment that will be the enclosure of the function evaluation frame. They are accessible via:
> formals(f)
$x
> body(f)
x^2
> environment(f)
<environment: R_GlobalEnv>
Moreover, str returns more info attached to f:
> str(f)
function (x)
- attr(*, "srcref")=Class 'srcref' atomic [1:8] 1 6 1 19 6 19 1 1
.. ..- attr(*, "srcfile")=Classes 'srcfilecopy', 'srcfile' <environment: 0x00000000145a3cc8>
Let's try to reach them:
> attributes(f)
$srcref
function(x)x^2
This is being printed as a text, but it's stored as a numeric vector:
> c(attributes(f)$srcref)
[1] 1 6 1 19 6 19 1 1
And this object also has its own attributes:
> attributes(attributes(f)$srcref)
$srcfile
$class
[1] "srcref"
The first one is an environment, with 3 internal objects:
> mode(attributes(attributes(f)$srcref)$srcfile)
[1] "environment"
> ls(attributes(attributes(f)$srcref)$srcfile)
[1] "filename" "fixedNewlines" "lines"
> attributes(attributes(f)$srcref)$srcfile$filename
[1] ""
> attributes(attributes(f)$srcref)$srcfile$fixedNewlines
[1] TRUE
> attributes(attributes(f)$srcref)$srcfile$lines
[1] "f <- function(x)x^2" ""
There you are! This is the string used by R to print attributes(f)$srcref.
So the questions are:
Are there any other objects linked to f? If so, how to reach them?
If we strip f of its attributes, using attributes(f) <- NULL, it doesn't seem to affect the function. Are there any drawbacks of doing this?
As far as I know, srcref is the only attribute typically attached to S3 functions. (S4 functions are a different matter, and I wouldn't recommend messing with their sometimes numerous attributes).
The srcref attribute is used for things like enabling printing of comments included in a function's source code, and (for functions that have been sourced in from a file) for setting breakpoints by line number, using utils::findLineNum() and utils::setBreakpoint().
If you don't want your functions to carry such additional baggage, you can turn off recording of srcref by doing options(keep.source=FALSE). From ?options (which also documents the related keep.source.pkgs option):
‘keep.source’: When ‘TRUE’, the source code for functions (newly
defined or loaded) is stored internally allowing comments to
be kept in the right places. Retrieve the source by printing
or using ‘deparse(fn, control = "useSource")’.
Compare:
options(keep.source=TRUE)
f1 <- function(x) {
## This function is needlessly commented
x
}
options(keep.source=FALSE)
f2 <- function(x) {
## This one is too
x
}
length(attributes(f1))
# [1] 1
f1
# function(x) {
# ## This function is needlessly commented
# x
# }
length(attributes(f2))
# [1] 0
f2
# function (x)
# {
# x
# }
I jst figured out an attribute that compiled functions (package compiler) have that is not available with attributes or str. It's the bytecode.
Example:
require(compiler)
f <- function(x){ y <- 0; for(i in 1:length(x)) y <- y + x[i]; y }
g <- cmpfun(f)
The result is:
> print(f, useSource=FALSE)
function (x)
{
y <- 0
for (i in 1:length(x)) y <- y + x[i]
y
}
> print(g, useSource=FALSE)
function (x)
{
y <- 0
for (i in 1:length(x)) y <- y + x[i]
y
}
<bytecode: 0x0000000010eb29e0>
However, this doesn't show with normal commands:
> identical(f, g)
[1] TRUE
> identical(f, g, ignore.bytecode=FALSE)
[1] FALSE
> identical(body(f), body(g), ignore.bytecode=FALSE)
[1] TRUE
> identical(attributes(f), attributes(g), ignore.bytecode=FALSE)
[1] TRUE
It seems to be accessible only via .Internal(bodyCode(...)):
> .Internal(bodyCode(f))
{
y <- 0
for (i in 1:length(x)) y <- y + x[i]
y
}
> .Internal(bodyCode(g))
<bytecode: 0x0000000010eb29e0>