Why doesn't this work? Or is this just the way R works?
Thanks
JJ
a <- c(1,2,3)
b <- 5
lapply(a, function(x) print(x)) # works
lapply(a, function(x,b) print(b)) # doesn't work.
I get --
Error in FUN(c(1, 2, 3)[[1L]], ...) :
argument "b" is missing, with no default
lapply only passes one argument on, because it's only designed to have one argument vary. If you just want to pass extra arguments along, supply them as additional arguments to lapply:
lapply(a, function(x,y) print(y), y=b)
[1] 5
[1] 5
[1] 5
[[1]]
[1] 5
[[2]]
[1] 5
[[3]]
[1] 5
From the lapply help file:
... optional arguments to FUN.
If you want more than one varying argument to be passed to your function, look at mapply.
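As a rough sketch of the difference (the second vector `w` and the `x + y` body are made up here for illustration, they are not from the question): extra arguments given to lapply are held fixed across all calls, while mapply steps through several vectors in parallel.

```r
a <- c(1, 2, 3)
b <- 5

# Fixed extra argument: every call to the function receives the same y.
fixed <- lapply(a, function(x, y) x + y, y = b)

# Two varying arguments: mapply() pairs up the elements of a and w.
# w is a hypothetical second vector, not part of the original question.
w <- c(10, 20, 30)
varying <- mapply(function(x, y) x + y, a, w)
```

Note that mapply simplifies its result to a vector here, whereas lapply always returns a list.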
You could try putting a and b together in a list as follows:
lapply(list(a, b), function(x) print(b))
or specifying an argument to pass b to, as in:
lapply(a, function(x, y=b) print(y))
But I'm not really sure what you're after.
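Another option, sketched here on the assumption that b is defined in the environment where the anonymous function is created: R's lexical scoping lets the function see b without it being an argument at all. The `x * b` body is only an illustration.

```r
a <- c(1, 2, 3)
b <- 5

# b is not a formal argument; it is looked up in the enclosing environment.
res <- lapply(a, function(x) x * b)
```

The failing call in the question breaks only because b was declared as a formal argument and then never supplied.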
Why don't lambda functions handle replacement functions in their natural form? For example, consider the length<- function. Say I want to standardize the lengths of a list of objects, I may do something like:
a <- list(c("20M1", "A1", "ACC1"), c("20M2", "A2", "ACC2"), c("20M3"))
mx <- max(lengths(a))
lapply(a, `length<-`, mx)
#> [[1]]
#> [1] "20M1" "A1" "ACC1"
#>
#> [[2]]
#> [1] "20M2" "A2" "ACC2"
#>
#> [[3]]
#> [1] "20M3" NA NA
However if I wanted to specify the argument input locations explicitly using a lambda function I'd need to do (which also works):
lapply(a, function(x) `length<-`(x, mx))
But why doesn't the more intuitive notation for replacement functions (see below) work?
lapply(a, function(x) length(x) <- mx)
#> [[1]]
#> [1] 3
#>
#> [[2]]
#> [1] 3
#>
#> [[3]]
#> [1] 3
This returns an output I did not expect. What is going on here? Lambda functions seem to handle the intuitive form of infix functions, so I was a little surprised they don't work with the intuitive form of replacement functions. Why is this / is there a way to specify replacement functions in lambda functions using their intuitive form?
(I imagine it has something to do with the special operator <-... but would be curious for a solution or more precise explanation).
Whenever you do an assignment in R, the value returned from that expression is the right hand side value. This is true even for "special" versions of assign functions. For example if you do this
x <- 1:2; y <- (names(x) <- letters[1:2])
> y
[1] "a" "b"
You can see that y gets the values of the names, not the updated value of x.
In your case if you want to return the updated value itself, you need to do so explicitly
lapply(a, function(x) {length(x) <- mx; x})
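To make the point concrete, here is a small sketch using the data from the question. It compares the three spellings; the variable names `padded_explicit`, `padded_direct` and `assigned` are just labels chosen for this example.

```r
a <- list(c("20M1", "A1", "ACC1"), c("20M2", "A2", "ACC2"), c("20M3"))
mx <- max(lengths(a))

# Modify the local copy, then return it explicitly.
padded_explicit <- lapply(a, function(x) { length(x) <- mx; x })

# Call the replacement function directly; it returns the modified object.
padded_direct <- lapply(a, `length<-`, mx)

# The assignment form as the last expression yields the right-hand side (mx).
assigned <- lapply(a, function(x) length(x) <- mx)
```

The first two results are identical padded vectors, while the third is just mx repeated once per element.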
Let's say I have a list of data.frames
dflist <- list(data.frame(a=1:3), data.frame(b=10:12, a=4:6))
If I want to extract the first column from each item in the list, I can do
lapply(dflist, `[[`, 1)
# [[1]]
# [1] 1 2 3
#
# [[2]]
# [1] 10 11 12
Why can't I use the "$" function in the same way?
lapply(dflist, `$`, "a")
# [[1]]
# NULL
#
# [[2]]
# NULL
But these both work:
lapply(dflist, function(x) x$a)
`$`(dflist[[1]], "a")
I realize that in this case one could use
lapply(dflist, `[[`, "a")
but I was working with an S4 object that didn't seem to allow indexing via [[. For example
library(adegenet)
data(nancycats)
catpop <- genind2genpop(nancycats)
mylist <- list(catpop, catpop)
#works
mylist[[1]]$tab
#doesn't work
lapply(mylist, "$", "tab")
# Error in slot(x, name) :
# no slot of name "..." for this object of class "genpop"
#doesn't work
lapply(mylist, "[[", "tab")
# Error in FUN(X[[1L]], ...) : this S4 class is not subsettable
For the first example, you can just do:
lapply(dflist, `$.data.frame`, "a")
For the second, use the slot() accessor function
lapply(mylist, "slot", "tab")
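For readers without adegenet installed, the slot() approach can be tried on a toy S4 class. This is only a sketch: the class `Pop` and its `tab` slot are hypothetical stand-ins for genpop, not the real adegenet definitions.

```r
library(methods)

# Hypothetical S4 class with a single matrix slot, standing in for genpop.
setClass("Pop", representation(tab = "matrix"))

objs <- list(new("Pop", tab = matrix(1:4, 2)),
             new("Pop", tab = matrix(5:8, 2)))

# slot() is an ordinary function that evaluates its name argument,
# so it can be handed to lapply directly.
tabs <- lapply(objs, slot, "tab")

# A wrapper around the @ operator gives the same result.
tabs_wrapped <- lapply(objs, function(x) x@tab)
```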
I'm not sure why method dispatch doesn't work in the first case, but the Note section of ?lapply does address this very issue of its borked method dispatch for primitive functions like $:
Note:
[...]
For historical reasons, the calls created by ‘lapply’ are
unevaluated, and code has been written (e.g., ‘bquote’) that
relies on this. This means that the recorded call is always of
the form ‘FUN(X[[i]], ...)’, with ‘i’ replaced by the current
(integer or double) index. This is not normally a problem, but it
can be if ‘FUN’ uses ‘sys.call’ or ‘match.call’ or if it is a
primitive function that makes use of the call. This means that it
is often safer to call primitive functions with a wrapper, so that
e.g. ‘lapply(ll, function(x) is.numeric(x))’ is required to ensure
that method dispatch for ‘is.numeric’ occurs correctly.
So it seems that this problem has more to do with $ and how it typically expects unquoted names as the second parameter rather than strings. Look at this example
dflist <- list(
data.frame(a=1:3, z=31:33),
data.frame(b=10:12, a=4:6, z=31:33)
)
lapply(dflist,
function(x, z) {
print(paste("z:",z));
`$`(x,z)
},
z="a"
)
We see the results
[1] "z: a"
[1] "z: a"
[[1]]
[1] 31 32 33
[[2]]
[1] 31 32 33
so the z value is being set to "a", but $ isn't evaluating the second parameter. So it's returning the "z" column rather than the "a" column. This leads to this interesting set of results
a<-"z"; `$`(dflist[[1]], a)
# [1] 1 2 3
a<-"z"; `$`(dflist[[1]], "z")
# [1] 31 32 33
a<-"z"; `$.data.frame`(dflist[[1]], a)
# [1] 31 32 33
a<-"z"; `$.data.frame`(dflist[[1]], "z")
# [1] 31 32 33
When we call $.data.frame directly we are bypassing the standard deparsing that occurs in the primitive prior to dispatching (this happens in R's C source for the primitive).
The added catch with lapply is that it passes along arguments to the function via the ... mechanism. For example
lapply(dflist, function(x, z) sys.call())
# [[1]]
# FUN(X[[2L]], ...)
# [[2]]
# FUN(X[[2L]], ...)
This means that when $ is invoked, it deparses the ... to the string "...". This explains this behavior
dflist <- list(data.frame(a=1:3, "..."=11:13, check.names=FALSE))
lapply(dflist, `$`, "a")
# [[1]]
# [1] 11 12 13
Same thing happens when you try to use ... yourself
f<-function(x,...) `$`(x, ...);
f(dflist[[1]], "a");
# [1] 11 12 13
`$`(dflist[[1]], "a")
# [1] 1 2 3
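Given all of the above, the safe patterns are the ones that never hand `$` a `...` to deparse. A minimal sketch, using the original two-data-frame list; the `nm` parameter and the `col` variable are names invented for this example.

```r
dflist <- list(data.frame(a = 1:3), data.frame(b = 10:12, a = 4:6))

# Wrapper: $ sees the literal symbol a inside the function body.
by_wrapper <- lapply(dflist, function(x) x$a)

# [[ evaluates its index argument, so it can be passed directly.
by_name <- lapply(dflist, `[[`, "a")

# When the column name is held in a variable, [[ inside a wrapper also works.
col <- "a"
by_variable <- lapply(dflist, function(x, nm) x[[nm]], nm = col)
```

All three return the `a` column of each data frame.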
I am currently trying to check whether a list (containing multiple vectors filled with values) is equal to a vector. Unfortunately the following functions did not work for me: match(), any(), %in%. An example of what I am trying to achieve is given below:
Let's say:
lists=list(c(1,2,3,4),c(5,6,7,8),c(9,7))
vector=c(1,2,3,4)
answer=match(lists,vector)
When I execute this, it returns False values instead of a positive result. Comparing a vector with a vector works, but comparing a vector with a list does not seem to work properly.
I would use intersect, something like this :
lapply(lists,intersect,vector)
[[1]]
[1] 1 2 3 4
[[2]]
numeric(0)
[[3]]
numeric(0)
I'm not completely sure what you want the result to be (for example do you care about vector order?) but regardless you'll need to think about lapply. For example,
##Create some data
R> lists=list(c(1,2,3,4),c(5,6,7,8),c(9,7))
R> vector=c(1,2,3,4)
then we use lapply to go through each list element and apply a function. In this case, I've used the match function (since you mentioned that in your question):
R> lapply(lists, function(i) all(match(i, vector)))
[[1]]
[1] TRUE
[[2]]
[1] NA
[[3]]
[1] NA
It's probably worth converting to a vector, so
R> unlist(lapply(lists, function(i) all(match(i, vector))))
[1] TRUE NA NA
and to change NA to FALSE, something like:
m = unlist(lapply(lists, function(i) all(match(i, vector))))
m[is.na(m)] = FALSE
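Depending on what "equal" should mean, it may be simpler to avoid the NA step altogether. The following is only a sketch of three possible readings (exact equality, same set of values, all values contained in the vector); which one fits is an assumption, since the question does not say.

```r
lists <- list(c(1, 2, 3, 4), c(5, 6, 7, 8), c(9, 7))
vector <- c(1, 2, 3, 4)

# Exactly the same values in the same order.
exact <- vapply(lists, identical, logical(1), vector)

# Same set of values, order ignored.
same_set <- vapply(lists, setequal, logical(1), vector)

# Every value of the list element occurs somewhere in vector.
contained <- vapply(lists, function(i) all(i %in% vector), logical(1))
```

Each of these gives a plain logical vector with no NA values, so `which(exact)` can be used to find the position of the matching element.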
Consider the following situation where I have a list of n matrices (this is just dummy data in the example below) in the object myList
mat <- matrix(1:12, ncol = 3)
myList <- list(mat1 = mat, mat2 = mat, mat3 = mat, mat4 = mat)
I want to select a specific column from each of the matrices and do something with it. This will get me the first column of each matrix and return it as a matrix (lapply() would give me a list; either is fine).
sapply(myList, function(x) x[, 1])
What I can't seem to be able to do is use [ directly as a function in my sapply() or lapply() incantations. ?'[' tells me that I need to supply argument j as the column identifier. So what am I doing wrong that this doesn't work?
> lapply(myList, `[`, j = 1)
$mat1
[1] 1
$mat2
[1] 1
$mat3
[1] 1
$mat4
[1] 1
Where I would expect this:
$mat1
[1] 1 2 3 4
$mat2
[1] 1 2 3 4
$mat3
[1] 1 2 3 4
$mat4
[1] 1 2 3 4
I suspect I am getting the wrong [ method but I can't work out why? Thoughts?
I think you are getting the 1 argument form of [. If you do lapply(myList, `[`, i =, j = 1) it works.
After two pints of Britain's finest ale and a bit of cogitation, I realise that this version will work:
lapply(myList, `[`, , 1)
i.e. don't name anything and treat it like I had done mat[ ,1]. Still don't grep why naming j doesn't work...
...actually, having read ?'[' more closely, I notice the following section:
Argument matching:
Note that these operations do not match their index arguments in
the standard way: argument names are ignored and positional
matching only is used. So ‘m[j=2,i=1]’ is equivalent to ‘m[2,1]’
and *not* to ‘m[1,2]’.
And that explains my quandary above. Yay for actually reading the documentation.
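A short sketch of what that section of the help page implies in practice, using a two-matrix version of the list from the question (the result names are invented for this example):

```r
mat <- matrix(1:12, ncol = 3)
myList <- list(mat1 = mat, mat2 = mat)

# Names on the indices are ignored, so this is mat[2, 1], not mat[1, 2].
named_index <- mat[j = 2, i = 1]

# An explicit wrapper and the empty-argument form select the same column.
cols_wrapper <- lapply(myList, function(x) x[, 1])
cols_empty <- lapply(myList, `[`, , 1)
```

The wrapper is longer, but it reads the same way as ordinary matrix indexing.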
It's because [ is a .Primitive function. It has no j argument. And there is no [.matrix method.
> `[`
.Primitive("[")
> args(`[`)
NULL
> methods(`[`)
[1] [.acf* [.AsIs [.bibentry* [.data.frame
[5] [.Date [.difftime [.factor [.formula*
[9] [.getAnywhere* [.hexmode [.listof [.noquote
[13] [.numeric_version [.octmode [.person* [.POSIXct
[17] [.POSIXlt [.raster* [.roman* [.SavedPlots*
[21] [.simple.list [.terms* [.ts* [.tskernel*
Though this really just begs the question of how [ is being dispatched on matrix objects...
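One way to look at that last point, offered as a sketch rather than a full account of the internals: primitives such as [ only attempt method dispatch on values for which is.object() is TRUE. A plain matrix has no class attribute, so it is handled by the built-in default code and needs no [.matrix method.

```r
mat <- matrix(1:12, ncol = 3)
df <- data.frame(a = 1:3)

prim <- is.primitive(`[`)            # [ is implemented internally
mat_is_object <- is.object(mat)      # no class attribute, so no dispatch
mat_old_class <- oldClass(mat)       # NULL for a plain matrix
df_is_object <- is.object(df)        # classed object, dispatches to a method
df_method <- exists("[.data.frame")  # the method that gets picked for df
```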