Let's say I have a list of data.frames
dflist <- list(data.frame(a=1:3), data.frame(b=10:12, a=4:6))
If I want to extract the first column from each item in the list, I can do
lapply(dflist, `[[`, 1)
# [[1]]
# [1] 1 2 3
#
# [[2]]
# [1] 10 11 12
Why can't I use the "$" function in the same way?
lapply(dflist, `$`, "a")
# [[1]]
# NULL
#
# [[2]]
# NULL
But these both work:
lapply(dflist, function(x) x$a)
`$`(dflist[[1]], "a")
I realize that in this case one could use
lapply(dflist, `[[`, "a")
but I was working with an S4 object that didn't seem to allow indexing via [[. For example
library(adegenet)
data(nancycats)
catpop <- genind2genpop(nancycats)
mylist <- list(catpop, catpop)
#works
mylist[[1]]$tab
#doesn't work
lapply(mylist, "$", "tab")
# Error in slot(x, name) :
# no slot of name "..." for this object of class "genpop"
#doesn't work
lapply(mylist, "[[", "tab")
# Error in FUN(X[[1L]], ...) : this S4 class is not subsettable
For the first example, you can just do:
lapply(dflist, `$.data.frame`, "a")
For the second, use the slot() accessor function
lapply(mylist, "slot", "tab")
I'm not sure why method dispatch doesn't work in the first case, but the Note section of ?lapply does address this very issue of its borked method dispatch for primitive functions like $:
Note:
[...]
For historical reasons, the calls created by ‘lapply’ are
unevaluated, and code has been written (e.g., ‘bquote’) that
relies on this. This means that the recorded call is always of
the form ‘FUN(X[[i]], ...)’, with ‘i’ replaced by the current
(integer or double) index. This is not normally a problem, but it
can be if ‘FUN’ uses ‘sys.call’ or ‘match.call’ or if it is a
primitive function that makes use of the call. This means that it
is often safer to call primitive functions with a wrapper, so that
e.g. ‘lapply(ll, function(x) is.numeric(x))’ is required to ensure
that method dispatch for ‘is.numeric’ occurs correctly.
So it seems that this problem has more to do with $ and how it typically expects unquoted names as the second parameter rather than strings. Look at this example
dflist <- list(
data.frame(a=1:3, z=31:33),
data.frame(b=10:12, a=4:6, z=31:33)
)
lapply(dflist,
function(x, z) {
print(paste("z:",z));
`$`(x,z)
},
z="a"
)
We see the results
[1] "z: a"
[1] "z: a"
[[1]]
[1] 31 32 33
[[2]]
[1] 31 32 33
so the z value is being set to "a", but $ isn't evaluating the second parameter. So it's returning the "z" column rather than the "a" column. This leads to this interesting set of results
a<-"z"; `$`(dflist[[1]], a)
# [1] 1 2 3
a<-"z"; `$`(dflist[[1]], "z")
# [1] 31 32 33
a<-"z"; `$.data.frame`(dflist[[1]], a)
# [1] 31 32 33
a<-"z"; `$.data.frame`(dflist[[1]], "z")
# [1] 31 32 33
When we call $.data.frame directly we are bypassing the standard deparsing of the second argument that the primitive performs before dispatching.
The added catch with lapply is that it passes along arguments to the function via the ... mechanism. For example
lapply(dflist, function(x, z) sys.call())
# [[1]]
# FUN(X[[2L]], ...)
# [[2]]
# FUN(X[[2L]], ...)
This means that when $ is invoked, it deparses the ... to the string "...". This explains this behavior
dflist<- list(data.frame(a=1:3, "..."=11:13, check.names=F))
lapply(dflist, `$`, "a")
# [[1]]
# [1] 11 12 13
Same thing happens when you try to use ... yourself
f<-function(x,...) `$`(x, ...);
f(dflist[[1]], "a");
# [1] 11 12 13
`$`(dflist[[1]], "a")
# [1] 1 2 3
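As a hedged sketch of the workarounds discussed above: wrapping the accessor in an anonymous function, or using an accessor that evaluates its name argument (`[[` or base R's getElement()), avoids handing `$` an unevaluated `...`. The data is the two-data-frame list from the question; the variable names via_wrapper, via_bracket and via_getElement are only illustrative.

```r
# Two data frames that both contain a column named "a" (from the question).
dflist <- list(data.frame(a = 1:3), data.frame(b = 10:12, a = 4:6))

# Wrapper: `$` now sees the literal name `a` in its own call.
via_wrapper <- lapply(dflist, function(x) x$a)

# `[[` evaluates its index, so a string passed through ... works.
via_bracket <- lapply(dflist, `[[`, "a")

# getElement() also evaluates its `name` argument; for S4 objects
# it looks up a slot instead of a list element.
via_getElement <- lapply(dflist, getElement, "a")

str(via_wrapper)
```

All three calls should return the "a" column of each data frame rather than NULL.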
Why don't lambda functions handle replacement functions in their natural form? For example, consider the length<- function. Say I want to standardize the lengths of a list of objects. I might do something like:
a <- list(c("20M1", "A1", "ACC1"), c("20M2", "A2", "ACC2"), c("20M3"))
mx <- max(lengths(a))
lapply(a, `length<-`, mx)
#> [[1]]
#> [1] "20M1" "A1" "ACC1"
#>
#> [[2]]
#> [1] "20M2" "A2" "ACC2"
#>
#> [[3]]
#> [1] "20M3" NA NA
However, if I wanted to spell out the argument positions explicitly with a lambda function, I would need to write the following (which also works):
lapply(a, function(x) `length<-`(x, mx))
But why doesn't the more intuitive notation for replacement functions (see below) work?
lapply(a, function(x) length(x) <- mx)
#> [[1]]
#> [1] 3
#>
#> [[2]]
#> [1] 3
#>
#> [[3]]
#> [1] 3
This returns an output I did not expect. What is going on here? Lambda functions seem to handle the intuitive form of infix functions, so I was a little surprised they don't work with the intuitive form of replacement functions. Why is this / is there a way to specify replacement functions in lambda functions using their intuitive form?
(I imagine it has something to do with the special operator <-... but would be curious for a solution or more precise explanation).
Whenever you do an assignment in R, the value returned from that expression is the right hand side value. This is true even for "special" versions of assign functions. For example if you do this
x <- 1:2; y <- (names(x) <- letters[1:2])
> y
[1] "a" "b"
You can see that y gets the values of the names, not the updated value of x.
In your case if you want to return the updated value itself, you need to do so explicitly
lapply(a, function(x) {length(x) <- mx; x})
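To make the point concrete, here is a minimal sketch on a single vector. It assumes nothing beyond base R; the names res and y are only illustrative.

```r
x <- c("20M3")
mx <- 3L

# `length(x) <- mx` is evaluated as x <- `length<-`(x, mx);
# the value of the whole expression is the right-hand side, mx.
res <- (length(x) <- mx)

# Calling the replacement function directly returns the modified object.
y <- `length<-`(c("20M3"), mx)

print(res)  # the right-hand side
print(x)    # x itself was padded with NA
```

This is why the anonymous function whose body is only `length(x) <- mx` returns 3 for every element: the function's value is the value of its last expression, which is the right-hand side.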
I have the following data frame:
> coc_comp_model[1:3,]
Relationship Output Input |r-Value| Y-Intercept Gradient
1 DG-r ~ DG-cl DG-r DG-cl 0.8271167 0.0027217513 12.9901380
2 CA3-r ~ CA3-cl CA3-r CA3-cl 0.7461309 0.0350767684 27.6107963
3 CA2-r ~ CA2-cl CA2-r CA2-cl 0.9732584 -0.0040992226 35.8299582
I want to create a simple function for each row of the data frame. Here's what I've tried:
for(i in 1:nrow(coc_comp_model)) {
coc_glm_f[i] <- function(x)
x*coc_comp_model$Gradient[i] + coc_comp_model$Y-Intercept[i]
}
I also tried making a vector of functions, which does not work either.
Thanks for reading this/helping.
Something like this:
myfunc<-function(datrow, x){
x*as.numeric(datrow[6]) + as.numeric(datrow[5] )
}
Then you can use apply to call it on each row, changing x as desired:
apply(coc_comp_model, 1, myfunc, x = 0.5)
Note: using dput() to share your data is much easier than pasting in a subset.
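One caveat worth illustrating: apply() converts a data frame to a matrix first, so when character columns are present each row arrives as a character vector, which is why the function above needs as.numeric(). The sketch below uses a hypothetical two-row subset of the question's data (the name model is made up) and shows a column-wise alternative that keeps the numbers numeric.

```r
# Hypothetical two-row subset of the question's data.
model <- data.frame(
  Input = c("DG-cl", "CA3-cl"),
  `Y-Intercept` = c(0.0027217513, 0.0350767684),
  Gradient = c(12.990138, 27.6107963),
  check.names = FALSE
)

# apply() works on as.matrix(model), which is a character matrix here.
row_types <- apply(model, 1, function(datrow) class(datrow))

# Working on the numeric columns directly avoids the string round trip.
x <- 0.5
pred <- x * model$Gradient + model$`Y-Intercept`

print(row_types)
print(pred)
```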
There is no such thing as a vector of functions. There are 6 atomic vector types in R: raw, logical, integer, double, complex, and character, plus there is the heterogeneous list type, and finally there is the lesser known expression type, which is basically a vector of parse trees (such as you get from a call to the substitute() function). Those are all the vector types in R.
printAndType <- function(x) { print(x); typeof(x); };
printAndType(as.raw(1:3));
## [1] 01 02 03
## [1] "raw"
printAndType(c(T,F));
## [1] TRUE FALSE
## [1] "logical"
printAndType(1:3);
## [1] 1 2 3
## [1] "integer"
printAndType(as.double(1:3));
## [1] 1 2 3
## [1] "double"
printAndType(c(1i,2i,3i));
## [1] 0+1i 0+2i 0+3i
## [1] "complex"
printAndType(letters[1:3]);
## [1] "a" "b" "c"
## [1] "character"
printAndType(list(c(T,F),1:3,letters[1:3]));
## [[1]]
## [1] TRUE FALSE
##
## [[2]]
## [1] 1 2 3
##
## [[3]]
## [1] "a" "b" "c"
##
## [1] "list"
printAndType(expression(a+1,sum(1,2+3*4),if (T) 1 else 2));
## expression(a + 1, sum(1, 2 + 3 * 4), if (T) 1 else 2)
## [1] "expression"
If you want to store multiple functions in a single object, you have to use a list, and you must use the double-bracket indexing operator in the lvalue to assign to it:
fl <- list();
for (i in 1:3) fl[[i]] <- (function(i) { force(i); function(a) a+i; })(i);
fl;
## [[1]]
## function (a)
## a + i
## <environment: 0x600da11a0>
##
## [[2]]
## function (a)
## a + i
## <environment: 0x600da1ab0>
##
## [[3]]
## function (a)
## a + i
## <environment: 0x600da23f8>
sapply(fl,function(f) environment(f)$i);
## [1] 1 2 3
sapply(fl,function(f) f(3));
## [1] 4 5 6
In the above code I also demonstrate the proper way to close over a loop variable. This requires creating a temporary function evaluation environment to hold a copy of i; the returned function then closes over that evaluation environment so that it can access the iteration-specific i. The same holds for other languages that support dynamic functions and closures, such as JavaScript. In R there is an additional requirement of forcing the promise to be resolved via force(). Otherwise, for each generated function independently, the promise would not be resolved until the first evaluation of that particular generated function, which would at that time lock in the current value of the promise target (the global i variable in this case) for that function. It should also be mentioned that this is a wasteful design: every iteration generates and evaluates a temporary function, which in turn creates a new evaluation environment with a copy of the loop variable.
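The effect of force() can be seen by building the same list with and without it. This is a small sketch, with the names lazy and eager chosen only for illustration; the functions are deliberately called only after both loops have finished.

```r
# Without force(): each promise for `i` is resolved on the first call,
# after the loop has finished, so every function sees the final i.
lazy <- list()
for (i in 1:3) lazy[[i]] <- (function(i) function(a) a + i)(i)

# With force(): the promise is resolved inside the loop iteration.
eager <- list()
for (i in 1:3) eager[[i]] <- (function(i) { force(i); function(a) a + i })(i)

lazy_out <- sapply(lazy, function(f) f(3))
eager_out <- sapply(eager, function(f) f(3))
print(lazy_out)   # 6 6 6
print(eager_out)  # 4 5 6
```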
If you wanted to use this design then your code would become:
coc_glm_f <- list();
for (i in 1:nrow(coc_comp_model)) {
coc_glm_f[[i]] <- (function(i) { force(i); function(x) x*coc_comp_model$Gradient[i] + coc_comp_model$`Y-Intercept`[i]; })(i);
};
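A variant of the same idea, sketched under the assumption that only the gradient and intercept are needed: a named factory function (make_line is a hypothetical helper, not from the question) combined with Map() builds the list without an explicit loop. The coefficients are copied from the three rows shown in the question.

```r
# Coefficients copied from the three rows shown in the question.
gradient  <- c(12.9901380, 27.6107963, 35.8299582)
intercept <- c(0.0027217513, 0.0350767684, -0.0040992226)

# make_line() is a hypothetical helper: it captures one
# gradient/intercept pair and returns the corresponding line function.
make_line <- function(g, b) {
  force(g); force(b)
  function(x) x * g + b
}

# One function per row, stored in a list.
line_fns <- Map(make_line, gradient, intercept)

line_fns[[2]](1)
```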
However, it probably doesn't make sense to create a separate function for every row of the data.frame. If you intended the x parameter to take a scalar value (by which I mean a one-element vector), then you can define the function as follows:
coc_glm_f <- function(x) x*coc_comp_model$Gradient + coc_comp_model$`Y-Intercept`;
This function is vectorized, meaning you can pass a vector for x, where each element of x would correspond to a row of coc_comp_model. For example:
coc_comp_model <- data.frame(
Relationship=c('DG-r ~ DG-cl','CA3-r ~ CA3-cl','CA2-r ~ CA2-cl'),
Output=c('DG-r','CA3-r','CA2-r'),
Input=c('DG-cl','CA3-cl','CA2-cl'),
`|r-Value|`=c(0.8271167,0.7461309,0.9732584),
`Y-Intercept`=c(0.0027217513,0.0350767684,-0.0040992226),
Gradient=c(12.9901380,27.6107963,35.8299582),
check.names=F
);
coc_glm_f(seq_len(nrow(coc_comp_model)));
## [1] 12.99286 55.25667 107.48578
I am new to R and have a question on the function posted here: R RStudio Resetting debug / function environment. Why are the objects set to themselves (e.g. "getmean = getmean" etc.)? Couldn't it simply be written as follows: list(set, get, setmean, getmean)
The difference is that
aa <- list(set, get, setmean, getmean)
is an unnamed list and
bb <- list(set=set, get=get, setmean=setmean, getmean=getmean)
is a named list. Compare names(aa) and names(bb).
And that = is not assignment. It's really just giving a label to a list item. It's one of the reasons R programmers try to only use <- for assignment and leave = with this special meaning. You could have easily also done
cc <- list(apple=set, banana=get, orange=setmean, grape=getmean)
cc$apple()
It doesn't have to be the exact same name.
Because list(set, get, setmean, getmean) won't tag the list elements with the correct names. Here's an example of the difference between tagged and untagged lists:
> list(1, 2, 3)
[[1]]
[1] 1
[[2]]
[1] 2
[[3]]
[1] 3
> list(foo=1, bar=2, baz=3)
$foo
[1] 1
$bar
[1] 2
$baz
[1] 3
Note that in the context of argument lists, = is used to supply named arguments; it does not do any assignment (unlike <-). Thus list(foo=1, bar=2, baz=3) is very different from list(foo<-1, bar<-2, baz<-3).
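The difference can be checked directly. In this small sketch the names tagged and untagged are only illustrative; note that the `<-` form creates foo and bar as variables in the calling environment as a side effect.

```r
tagged <- list(foo = 1, bar = 2)        # `=` labels the elements
untagged <- list(foo <- 10, bar <- 20)  # `<-` assigns foo and bar, no labels

print(names(tagged))    # element names are present
print(names(untagged))  # no names at all
print(foo)              # foo now exists as a variable
```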
The question has been answered, but you could also do this to achieve the same result.
> object <- c('set', 'get', 'setmean', 'getmean')
> setNames(object = as.list(object), nm = object)
# $set
# [1] "set"
#
# $get
# [1] "get"
#
# $setmean
# [1] "setmean"
#
# $getmean
# [1] "getmean"
Whether the quotation marks are needed depends on what these values actually are.
And you can set different names like this
> setNames(as.list(object), letters[1:4])
# $a
# [1] "set"
#
# $b
# [1] "get"
#
# $c
# [1] "setmean"
#
# $d
# [1] "getmean"
setNames comes in handy when working with lapply.
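A brief sketch of that last point, assuming the same character vector as above: because lapply() keeps the names of its input, naming the input with setNames(nm = ...) labels each result by the element it came from. The use of nchar() here is only a stand-in for whatever function you would apply.

```r
object <- c("set", "get", "setmean", "getmean")

# setNames(nm = object) uses the strings as both values and names,
# so the lapply() result is labelled by the element it came from.
lens <- lapply(setNames(nm = object), nchar)

str(lens)
```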
Why doesn't this work? Or is this just the way R works?
Thanks
JJ
a <- c(1,2,3)
b <- 5
lapply(a, function(x) print(x)) # works
lapply(a, function(x,b) print(b)) # doesn't work.
I get --
Error in FUN(c(1, 2, 3)[[1L]], ...) :
argument "b" is missing, with no default
lapply only passes each element of a on as the first argument, because it's designed to have a single argument vary. If you just want to pass extra arguments along, supply them as additional arguments to lapply:
lapply(a, function(x,y) print(y), y=b)
[1] 5
[1] 5
[1] 5
[[1]]
[1] 5
[[2]]
[1] 5
[[3]]
[1] 5
From the lapply help file:
... optional arguments to FUN.
If you want more than one varying argument to be passed to your function, look at mapply.
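To contrast the two cases, here is a minimal sketch. The vector offsets is a made-up second varying argument; a and b are the values from the question.

```r
a <- c(1, 2, 3)
b <- 5
offsets <- c(10, 20, 30)  # hypothetical second varying argument

# Fixed extra argument: b is passed unchanged to every call.
fixed <- lapply(a, function(x, y) x + y, y = b)

# Two varying arguments: mapply() walks a and offsets in parallel.
varying <- mapply(function(x, y) x + y, a, offsets)

str(fixed)
print(varying)
```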
You could try putting a and b together in a list as follows:
lapply(list(a, b), function(x) print(b))
or specifying an argument to pass b to, as in:
lapply(a, function(x, y=b) print(y))
But I'm not really sure what you're after.