Using apply() over a list - r

I have a list with 100 vectors indexed by:
[[1]]
[1]
[[2]]
[1]
[[3]]
[1]
.
.
.
[[100]]
[1]
Each vector has 3 entries.
I would like to apply a function separately for each of the vectors. The function returns a single number for each vector so the result of the apply() would be a 100 element vector.
How can this be done using apply?
I know I can use apply for matrices by indexing 1 or 2 depending on row or column but can it also be used for lists?

You are looking for sapply:
l <- list(1:3, 2:4, 5:7)
sapply(l, sum)
# [1] 6 9 18
This answer might help you in the future.
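As a rough sketch of the same idea at the scale described in the question (100 vectors of 3 entries each), vapply() does the same job as sapply() but states up front that every call must return a single number. The random data and the names l100, res_sapply and res_vapply are made up for illustration:

```r
# Hypothetical data: a list of 100 numeric vectors with 3 entries each
set.seed(1)
l100 <- replicate(100, runif(3), simplify = FALSE)

# sapply() simplifies the per-element results to a vector when it can
res_sapply <- sapply(l100, sum)

# vapply() does the same, but fails if any result is not one numeric value
res_vapply <- vapply(l100, sum, numeric(1))

length(res_vapply)  # 100
```

apply() itself is meant for matrices and arrays; for lists, lapply() returns a list, while sapply() and vapply() simplify the result to a vector.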

Related

Create and apply a function that calculates the length of many strings in a vector

Suppose I have a long vector with characters which is more or less like this:
vec <- c("32, 25", "5", "15, 24")
I want to apply a function that gives me the number of comma-separated strings in each element and returns a vector of those counts. Using lapply and my toy vector, this is my approach:
lapply(vec, function(x) {
a <- strsplit(x, ",")
y <- length(a[[1:length(a)]])
unlist(y[1:length(y)])
})
[[1]]
[1] 2
[[2]]
[1] 1
[[3]]
[1] 2
This almost gives me what I want, since the first element has 2 strings, the second 1 string and the third 2 strings. The problem is that I can't get my function to return a vector of the form c(2,1,2). I'm using this function to create a new variable in a data.frame I'm working with.
Any idea will be much appreciated.
You could do:
stringr::str_count(vec, ",") + 1
#> [1] 2 1 2
Or, in base R:
nchar(gsub("[^,]", "", vec)) + 1
#> [1] 2 1 2
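Another base R option, closer to the strsplit() approach in the question, might look like the sketch below. The names n_items and n_items2 are made up for illustration:

```r
vec <- c("32, 25", "5", "15, 24")

# strsplit() returns one character vector of pieces per input string;
# lengths() counts the pieces in each and returns an integer vector
n_items <- lengths(strsplit(vec, ","))
n_items

# The same with sapply() in place of the lapply() used in the question
n_items2 <- sapply(strsplit(vec, ","), length)
```

Because the result is a plain vector with one entry per input string, it can be assigned directly to a column of a data.frame.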

Function of unlist() when turning one row of a dataframe to a matrix

What is the difference between matrix(unlist(DF[1,])) and matrix(DF[1,]) where DF is my dataframe. How does unlist() help here?
DF[1,] will extract the first row of the data.frame. This row is still a data.frame, which is a type of list. unlist() will convert it to a vector that can be made into a matrix. If you don't use unlist, then you can still make a matrix, but it is a matrix of the elements of the list, rather than of the elements of a vector. For example,
> cars[1,]
speed dist
1 4 2
> a <- matrix(cars[1,])
> b <- matrix(unlist(cars[1,]))
> a[,1]
[[1]]
[1] 4
[[2]]
[1] 2
> b[,1]
[1] 4 2
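The difference can also be checked programmatically. The sketch below uses the built-in cars data set from the example; the names row1 and m are made up for illustration, and as.matrix() is shown only as one possible alternative:

```r
row1 <- cars[1, ]          # still a data.frame, i.e. a list of columns

a <- matrix(row1)          # a matrix whose cells are list elements
b <- matrix(unlist(row1))  # a plain numeric matrix

is.list(a)     # TRUE
is.numeric(b)  # TRUE

# as.matrix() is another option; it keeps the one-row, two-column layout
m <- as.matrix(row1)
dim(m)
```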

Subset different vector elements within a list

Assume I have this list of vectors:
mylist <- list(a=1:3,b=4:1,c=1:5)
mylist
$a
[1] 1 2 3
$b
[1] 4 3 2 1
$c
[1] 1 2 3 4 5
I want to get the last or the max element of each vector like this for the last element:
$a
[1] 3
$b
[1] 1
$c
[1] 5
What I have tried so far:
First use lapply and the length function to get the last element index and then subset:
last <- unlist(lapply(mylist, length))
lapply(mylist,"[", last) # not working
Then I tried to use sapply with lapply. This is working, but I'm not sure whether this is generally valid. There must be a better base R solution (without loops!).
mymatrix <- sapply(last, function(x) lapply(mylist, "[",x))
diag(mymatrix)
$a
[1] 3
$b
[1] 1
$c
[1] 5
(Making this a community wiki, as there were many contributions here that are worth summing up.)
If you have some function you want to apply on your list, a simple lapply should do, such as
lapply(mylist, max) # retrieving the maximum values
Or
lapply(mylist, tail, 1) # retrieving the last values (by @docendo)
If you want to operate on two vectors simultaneously, you could use mapply or Map
Map(`[`, mylist, lengths(mylist)) # A Map version of @docendo's lapply suggestion
Or per your newest request
Map(`[`, mylist, 1:3)
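The attempt lapply(mylist, "[", last) in the question does not work because the whole vector last is passed to every call, so each vector is indexed with all three positions instead of just its own. A minimal sketch of variants that return a plain named vector rather than a list (the names last_vals, last_vals2 and max_vals are made up for illustration):

```r
mylist <- list(a = 1:3, b = 4:1, c = 1:5)

# Last element of each vector, simplified to a named integer vector
last_vals <- sapply(mylist, tail, 1)

# Same result by pairing each vector with its own length
last_vals2 <- mapply(`[`, mylist, lengths(mylist))

# Maximum of each vector, with the return type checked by vapply()
max_vals <- vapply(mylist, max, integer(1))
```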

Apply function to corresponding elements in list of data frames

I have a list of data frames in R. All of the data frames in the list are of the same size. However, the elements may be of different types. For example,
I would like to apply a function to corresponding elements of the data frames. For example, I want to use the paste function to produce a data frame such as
"1a" "2b" "3c"
"4d" "5e" "6f"
Is there a straightforward way to do this in R? I know it is possible to use the Reduce function to apply a function on corresponding elements of data frames within lists. But using the Reduce function in this case does not seem to have the desired effect.
Reduce(paste,l)
Produces:
"c(1, 4) c(\"a\", \"d\")" "c(2, 5) c(\"b\", \"e\")" "c(3, 6) c(\"c\", \"f\")"
Wondering if I can do this without writing messy for loops. Any help is appreciated!
Instead of Reduce, use Map.
# not quite the same as your data
l <- list(data.frame(matrix(1:6,ncol=3)),
          data.frame(matrix(letters[1:6],ncol=3), stringsAsFactors=FALSE))
# this returns a list
LL <- do.call(Map, c(list(f=paste0),l))
# convert the list of columns back to a data frame
as.data.frame(LL)
# X1 X2 X3
# 1 1a 3c 5e
# 2 2b 4d 6f
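When the list holds exactly two data frames, the do.call() wrapper is arguably optional, since both can be passed to Map directly. A minimal sketch, assuming the same example data as above (the name res is made up for illustration):

```r
l <- list(
  data.frame(matrix(1:6, ncol = 3)),
  data.frame(matrix(letters[1:6], ncol = 3), stringsAsFactors = FALSE)
)

# Map() walks the two data frames column by column and pastes matching columns
LL <- Map(paste0, l[[1]], l[[2]])
res <- as.data.frame(LL, stringsAsFactors = FALSE)
res
```

The do.call() form has the advantage of working unchanged for a list of any length.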
To explain @mnel's excellent answer a bit more, consider the simple example of summing the corresponding elements of two vectors:
Map(sum,1:3,4:6)
[[1]]
[1] 5 # sum(1,4)
[[2]]
[1] 7 # sum(2,5)
[[3]]
[1] 9 # sum(3,6)
Map(sum,list(1:3,4:6))
[[1]]
[1] 6 # sum(1:3)
[[2]]
[1] 15 # sum(4:6)
Why the second one is the case might be made more obvious by adding a second list, like:
Map(sum,list(1:3,4:6),list(0,0))
[[1]]
[1] 6 # sum(1:3,0)
[[2]]
[1] 15 # sum(4:6,0)
Now, the next part is trickier. As the help page ?do.call states:
‘do.call’ constructs and executes a function call from a name or a
function and a list of arguments to be passed to it.
So, doing:
do.call(Map,c(sum,list(1:3,4:6)))
calls Map with the inputs of the list c(sum,list(1:3,4:6)), which looks like:
[[1]] # first argument to Map
function (..., na.rm = FALSE) .Primitive("sum") # the 'sum' function
[[2]] # second argument to Map
[1] 1 2 3
[[3]] # third argument to Map
[1] 4 5 6
...and which is therefore equivalent to:
Map(sum, 1:3, 4:6)
Looks familiar! It is equivalent to the first example at the top of this answer.
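The equivalence can be verified with identical(). In this sketch the names direct, via_do and via_named are made up for illustration:

```r
# Calling Map directly and building the call with do.call give the same list
direct <- Map(sum, 1:3, 4:6)
via_do <- do.call(Map, c(sum, list(1:3, 4:6)))
identical(direct, via_do)

# Naming the function argument f, as in the data frame answer, also works
via_named <- do.call(Map, c(list(f = sum), list(1:3, 4:6)))
identical(direct, via_named)
```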

Paste column values together in a data frame

I am trying to paste together the row name along with the data in the desired columns. I wrote the following code but somehow could not find a way to do it correctly.
The desired output will be: "a,1,11" "b,2,22" "c,3,33"
x = data.frame(cbind(f1 = c(1,2,3), f2 = c(5,6,7), f3=c(11,22,33)), row.names= c('a','b','c'))
x
# f1 f2 f3
# a 1 5 11
# b 2 6 22
# c 3 7 33
do.call("paste", c(rownames(x), x[c('f1','f3')], sep=","))
# [1] "a,b,c,1,11" "a,b,c,2,22" "a,b,c,3,33"
Two main points:
Use apply instead of do.call(paste, .)
Use cbind instead of c in this case.
If you would rather use c, you would need to coerce the row names to a list or column first, e.g. c(list(rownames(x)), x)
Try the following:
apply(cbind(rownames(x), x[c('f1','f3')]), 1, paste, collapse=",")
a b c
"a,1,11" "b,2,22" "c,3,33"
Your do.call instructs R to paste the list c(rownames(x), x[c('f1','f3')]) together. But take a look at your list.
> c(rownames(x), x[c('f1','f3')])
[[1]]
[1] "a"
[[2]]
[1] "b"
[[3]]
[1] "c"
$f1
[1] 1 2 3
$f3
[1] 11 22 33
The c command takes the elements of each argument and joins them together. This properly deconstructs x[c('f1','f3')] but also deconstructs rownames(x) in a way you don't want. Obeying the standard recycling rule, paste then takes an item from each list element and patches them together with sep=",".
You could fix this by encapsulating rownames(x) inside a list structure so that your list of arguments comes out properly:
do.call("paste", c(list(rownames(x)), x[c('f1','f3')], sep=","))
No need for do.call or apply:
paste(rownames(x),x[[1]],x[[3]] , sep=",")
[1] "a,1,11" "b,2,22" "c,3,33"
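As a quick check that the do.call() fix and the direct paste() call agree, here is a small sketch. The data frame is rebuilt without cbind for brevity, and the names via_do_call and via_paste are made up for illustration:

```r
x <- data.frame(f1 = c(1, 2, 3), f2 = c(5, 6, 7), f3 = c(11, 22, 33),
                row.names = c("a", "b", "c"))

# Row names wrapped in list() so c() keeps them as one argument to paste()
via_do_call <- do.call(paste, c(list(rownames(x)), x[c("f1", "f3")], sep = ","))

# Direct vectorized call, selecting the columns by name
via_paste <- paste(rownames(x), x$f1, x$f3, sep = ",")

identical(via_do_call, via_paste)
```

The do.call() form is mainly useful when the set of columns is chosen programmatically; for a fixed, small number of columns the direct call is easier to read.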
