Is R's list() function actually creating a nested list? - r

R may have its own logic, but list() did not give me what I expected.
l1 <- list(1,2)
$> l1
[[1]]
[1] 1
[[2]]
[1] 2
To retrieve an element, I need to use double brackets:
$> l1[[1]]
[1] 1
$> class(l1[[1]])
"numeric"
Single brackets give me a sub-list (which is also a list object):
$> l1[1]
[[1]]
[1] 1
$> class(l1[1])
"list"
I am not saying this is wrong, but it isn't what I expected: I was trying to create a 1-dimensional list, whereas what I actually get looks like a nested list, a 2-dimensional object.
What is the logic behind this behaviour, and how do we create an OO-type list, i.e., a 1-dimensional data structure?
The behaviour I am expecting, with a 1 dimensional data structure, is:
$> l1[1]
[1] 1
$> l1[2]
[2] 2

If you want to create a list with the two numbers in one element, you are looking for this:
l1 <- list(c(1, 2))
l1
#> [[1]]
#> [1] 1 2
Your code basically puts two vectors of length 1 into a list. To make R understand that you have one vector, you need to combine (i.e., c()) the values into a vector first.
This probably becomes clearer when we create the two vectors as objects first:
v1 <- 1
v2 <- 2
l2 <- list(v1, v2)
l2
#> [[1]]
#> [1] 1
#>
#> [[2]]
#> [1] 2
If you simply want to store the two values in an object, you want a vector:
l1 <- c(1, 2)
l1
#> [1] 1 2
For more on the different data structures in R I recommend this chapter: http://adv-r.had.co.nz/Data-structures.html
For the question about [ and [[ indexing, have a look at this classic answer: https://stackoverflow.com/a/1169495/5028841
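To make the difference between the two indexing operators concrete, here is a minimal sketch comparing the list from the question with a plain vector. The object names l and v are chosen only for this illustration.

```r
l <- list(1, 2)   # a list with two elements, each a length-1 numeric vector
v <- c(1, 2)      # an atomic numeric vector with two elements

class(l[1])    # "list": single brackets return a smaller list
class(l[[1]])  # "numeric": double brackets extract the element itself

class(v[1])    # "numeric": subsetting a vector gives a vector
identical(v[1], v[[1]])  # TRUE: for a single position both give the same value
```

So the list is not nested; single brackets simply always return the same kind of container they were applied to.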

Related

Why does as.list() applied to a vector generate a list that is not treated the same as a list generated with list() in R?

Here is a very basic example that illustrates the differences in R
Given the following data frames:
a <- data.frame(l=c("object1", "object2"))
b <- data.frame(l=c("object3", "object4"))
Creating a vector for the names of the data frames:
vector <- c("a","b")
And then applying as.list()
list_of_vector <- as.list(vector)
If we try to loop over this:
lapply(list_of_vector, print)
The output is
[1] "a"
[1] "b"
[[1]]
[1] "a"
[[2]]
[1] "b"
Compared to just manually creating a list and then running the same loop:
straight_list <- list(a,b)
lapply(straight_list, print)
l
1 object1
2 object2
l
1 object3
2 object4
[[1]]
l
1 object1
2 object2
[[2]]
l
1 object3
2 object4
I would like to understand what makes as.list() different from list and how I would be able to convert a vector like the above to create the 2nd, rather than first output. Thanks in advance :)
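The difference seen above is that as.list() on a character vector produces a list of strings, while list(a, b) stores the data frames themselves. One possible way to go from names to objects is base R's mget(), sketched below; the variable names df_names, by_name, by_value and looked_up are made up for this example, and it assumes the data frames live in the environment the code runs in.

```r
a <- data.frame(l = c("object1", "object2"))
b <- data.frame(l = c("object3", "object4"))
df_names <- c("a", "b")       # hypothetical name for the character vector

by_name  <- as.list(df_names) # a list holding the strings "a" and "b"
by_value <- list(a, b)        # a list holding the two data frames

# mget() looks up each name and returns the objects as a named list
looked_up <- mget(df_names)

class(by_name[[1]])    # "character"
class(looked_up[[1]])  # "data.frame"
identical(unname(looked_up), by_value)  # TRUE
```

Calling lapply(looked_up, print) then prints the data frames rather than their names.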

Weird behavior when trying to sort vectors in a list using a loop

I want to sort vectors in a list. I tried the following:
test <- list(c(2,3,1), c(3,2,1), c(1,2,3))
for (i in length(test)){
test[[i]] <- sort(test[[i]])
}
test
Which returns the list unchanged (vectors not sorted):
[[1]]
[1] 2 3 1
[[2]]
[1] 3 2 1
[[3]]
[1] 1 2 3
However when I sort manually outside the loop the order is stored:
test[[1]]
[1] 2 3 1
test[[1]] <- sort(test[[1]])
test[[1]]
[1] 1 2 3
Why does the behaviour in the loop differ? I would expect the loop to store three vectors c(1,2,3) in the list. What am I missing?
I just figured it out: length(test) is 3, so for (i in length(test)) iterates over the single value 3 and only touches the third vector, which was already sorted. I should have used for (i in 1:length(test)).
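As a small sketch of the fix, the loop can iterate over seq_along(test), which also behaves sensibly for an empty list, or the loop can be replaced by lapply(). The names sorted_loop and sorted_apply are introduced only for this illustration.

```r
test <- list(c(2, 3, 1), c(3, 2, 1), c(1, 2, 3))

# Loop over every index, not just the value of length(test)
sorted_loop <- test
for (i in seq_along(sorted_loop)) {
  sorted_loop[[i]] <- sort(sorted_loop[[i]])
}

# Equivalent without an explicit loop
sorted_apply <- lapply(test, sort)

identical(sorted_loop, sorted_apply)  # TRUE
```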

Deconstruct DNAStringSets into normal strings

This comes from an R library called "VariantAnnotation" and its dependency "Biostrings".
I have a DNAStringSetList and I want to transform it into a normal list or a vector of strings.
library(VariantAnnotation)
fl <- system.file("extdata", "chr22.vcf.gz", package="VariantAnnotation")
vcf <- readVcf(fl, "hg19")
tempo <- rowRanges(vcf)$ALT # Here is the DNAStringSetList I mean.
print(tempo)
A DNAStringSet instance of length 10376
width seq
[1] 1 G
[2] 1 T
[3] 1 A
[4] 1 T
[5] 1 T
... ... ...
[10372] 1 G
[10373] 1 G
[10374] 1 G
[10375] 1 A
[10376] 1 C
tempo[[1]]
A DNAStringSet instance of length 1
width seq
[1] 1 G
But I don't want this format. I just want strings of the bases, in order to insert them as a column in a new dataframe. I want this:
G
T
A
T
T
I have accomplished this with this package method:
as.character(tempo@unlistData)
However, it returns 10 rows more than tempo has! The head and tail of this result and of tempo are exactly the same, so somewhere in the middle there are 10 extra rows that should not have been formed (not NAs)
You can call as.character on either a DNAString or a DNAStringSet.
as.character(tempo[1 : 5])
# [1] "G" "T" "A" "T" "T"
A simple loop solves the issue, using the toString function of the same library:
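# Preallocate a character vector, then collapse each record's ALT allele(s) into one string
ALT <- character(nrow(vcf))
for (i in seq_len(nrow(vcf))) { ALT[i] <- toString(tempo[[i]]) }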
However, I have no idea why tempo@unlistData retrieves too many rows. It is not trustworthy.
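One likely explanation, stated here as an assumption rather than something the thread confirms, is that a few records carry more than one alternative allele, so flattening the whole structure yields more strings than there are records. The sketch below imitates this with a plain base R list (the data in alt is hypothetical and no Bioconductor package is used).

```r
# Plain-list stand-in for the ALT column (hypothetical data):
# the second record carries two alternative alleles instead of one.
alt <- list("G", c("T", "A"), "C")

length(alt)              # 3 records
length(unlist(alt))      # 4 alleles: flattening gives one extra entry
which(lengths(alt) > 1)  # 2: the record responsible for the extra entry

# One string per record, as the toString() loop produces
per_record <- vapply(alt, toString, character(1))
per_record               # "G" "T, A" "C"
```

If the assumption holds for the real data, checking which elements of tempo have length greater than one would locate the extra rows.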

Apply function to corresponding elements in list of data frames

I have a list of data frames in R. All of the data frames in the list are of the same size. However, the elements may be of different types. For example,
I would like to apply a function to corresponding elements of the data frames. For example, I want to use the paste function to produce a data frame such as
"1a" "2b" "3c"
"4d" "5e" "6f"
Is there a straightforward way to do this in R? I know it is possible to use the Reduce function to apply a function on corresponding elements of data frames within lists. But using the Reduce function in this case does not seem to have the desired effect.
Reduce(paste,l)
Produces:
"c(1, 4) c(\"a\", \"d\")" "c(2, 5) c(\"b\", \"e\")" "c(3, 6) c(\"c\", \"f\")"
Wondering if I can do this without writing messy for loops. Any help is appreciated!
Instead of Reduce, use Map.
# not quite the same as your data
l <- list(data.frame(matrix(1:6, ncol = 3)),
          data.frame(matrix(letters[1:6], ncol = 3), stringsAsFactors = FALSE))
# this returns a list
LL <- do.call(Map, c(list(f = paste0), l))
# convert the list back to a data frame
as.data.frame(LL)
# X1 X2 X3
# 1 1a 3c 5e
# 2 2b 4d 6f
To explain @mnel's excellent answer a bit more, consider the simple example of summing the corresponding elements of two vectors:
Map(sum,1:3,4:6)
[[1]]
[1] 5 # sum(1,4)
[[2]]
[1] 7 # sum(2,5)
[[3]]
[1] 9 # sum(3,6)
Map(sum,list(1:3,4:6))
[[1]]
[1] 6 # sum(1:3)
[[2]]
[1] 15 # sum(4:6)
Why the second one is the case might be made more obvious by adding a second list, like:
Map(sum,list(1:3,4:6),list(0,0))
[[1]]
[1] 6 # sum(1:3,0)
[[2]]
[1] 15 # sum(4:6,0)
Now, the next is more tricky. As the help page ?do.call states:
‘do.call’ constructs and executes a function call from a name or a
function and a list of arguments to be passed to it.
So, doing:
do.call(Map,c(sum,list(1:3,4:6)))
calls Map with the inputs of the list c(sum,list(1:3,4:6)), which looks like:
[[1]] # first argument to Map
function (..., na.rm = FALSE) .Primitive("sum") # the 'sum' function
[[2]] # second argument to Map
[1] 1 2 3
[[3]] # third argument to Map
[1] 4 5 6
...and which is therefore equivalent to:
Map(sum, 1:3, 4:6)
Looks familiar! It is equivalent to the first example at the top of this answer.
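The same equivalence can be checked for the data frame case from the first answer. This is only a sketch: via_do_call and direct are names introduced here, and the data is the example data from that answer, not the asker's.

```r
l <- list(data.frame(matrix(1:6, ncol = 3)),
          data.frame(matrix(letters[1:6], ncol = 3), stringsAsFactors = FALSE))

# do.call() spreads the list elements out as separate arguments to Map ...
via_do_call <- do.call(Map, c(list(f = paste0), l))

# ... which is the same as writing the call out by hand for two data frames
direct <- Map(paste0, l[[1]], l[[2]])

identical(via_do_call, direct)  # TRUE
as.data.frame(direct)
```

The do.call() form has the advantage that it works unchanged when the list holds three or more data frames.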

Extract vectors from elements of list of vectors

I have some JSON data, [{a:10, b:123,c:4.5},{a:2,b:5,c:33}] and so on, that I read into R via json_data <- fromJSON(paste(json_file, collapse="")) (json_file is the input URL). So far, so good.
Now I would like to create vectors from this input, which fromJSON has converted into a list of vectors with components a, b, c.
Is there a better way than looping over the input list and doing this manually by concatenating the individual vector components on the new target vector(s)?
If you have a list like this:
l <- list(c(a=10, b=123, c=4.5),c(a=2,b=5,c=33))
You could just do something like the following:
df <- data.frame(do.call(rbind, l))
# a b c
# 1 10 123 4.5
# 2 2 5 33.0
as.list(df)
# $a
# [1] 10 2
# $b
# [1] 123 5
# $c
# [1] 4.5 33.0
(The do.call(rbind, X) construct is handy, allowing you to rbind together the elements of a list of arbitrary length. You can then slice and dice the resulting matrix as you see fit; I just converted it to a data.frame and then to a list to show a couple of possibilities.)
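If only one component is needed, an alternative sketch is to pull it out directly with sapply() and the `[[` operator, skipping the intermediate matrix. The names a_values and m below are made up for this example.

```r
l <- list(c(a = 10, b = 123, c = 4.5), c(a = 2, b = 5, c = 33))

# Extract component "a" from every vector in the list
a_values <- sapply(l, `[[`, "a")
a_values  # 10 2

# The rbind route gives the same column
m <- do.call(rbind, l)
identical(unname(m[, "a"]), a_values)  # TRUE
```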

Resources