My question stems from the use of [[ and ]] in user-created functions to reference list elements. From what I can tell, [[ and ]] work the same way as [ and ] when applied to vectors.
Is this true of all other list operations though? As another example, I can use lapply on a vector.
It makes sense that this is true if a list is just a generalised vector, whose entries can be of differing modes.
EDIT: The one-and-a-half line answer is that both lists and atomic vectors are types of vectors, and subset exactly the same way.
This answer expands on the difference between lists and atomic vectors.
The best explanation of R's data structures, and specifically of the difference between lists and atomic vectors, is (in my opinion) Hadley Wickham's new book:
http://adv-r.had.co.nz/Data-structures.html
Both lists and atomic vectors are one-dimensional data structures. However, atomic vectors are homogeneous and lists are heterogeneous. Lists can contain any type of vector, including other lists. Atomic vectors, on the other hand, are flat.
As far as subsetting with [] vs [[]] goes, [] is preserving for both lists and atomic vectors, whereas [[]] is simplifying. Thus, [] and [[]] are NOT the same, whether applied to lists OR atomic vectors. For example, [[]] will simplify a named vector by removing the name; subsetting a named vector with [] will keep the name. For a list, [[]] will pull out the contents of a list element, and can return a number of simplified data structures. Subsetting a list with [] will always return a list (preserving).
Subsetting an atomic vector by [[]] returns a length one atomic vector. Subsetting a list by [[]] can return a number of different classes of data structures. This goes back to the fact that atomic vectors are homogeneous and lists are heterogeneous. However, according to Hadley, subsetting a list works exactly the same way as subsetting an atomic vector.
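To make the preserving vs simplifying distinction concrete, here is a minimal sketch. The objects v and l are made up for illustration and do not come from the question:

```r
# Named atomic vector: [ keeps the name, [[ drops it
v <- c(a = 1, b = 2)
v["a"]      # still a named numeric vector of length 1
v[["a"]]    # a bare numeric value, the name is gone

# List: [ returns a (shorter) list, [[ returns the element itself
l <- list(a = 1:3, b = "x")
class(l["a"])    # "list"
class(l[["a"]])  # "integer"
```

So the two operators follow the same rules on both kinds of vector; what differs is what the "simplified" result can be.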
Take a look at this section of Hadley's book for further reference:
http://adv-r.had.co.nz/Subsetting.html#subsetting-operators
Since I wasn't able to come up with any more counter examples, I referred to the documentation on R's internals, and it appears your intuition is correct.
If you look at the section on the underlying structure of R's data structures in C,
SEXPTYPEs, lists are implied to be generic vectors:
19 VECSXP list (generic vector)
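You can check the "list is a generic vector" idea from the R prompt as well. This is only a small sketch with made-up values, not something taken from the internals documentation:

```r
x <- list(1, "a", TRUE)
is.vector(x)   # TRUE: a list counts as a vector
is.atomic(x)   # FALSE: but it is not an atomic vector
typeof(x)      # "list"

# lapply() accepts an atomic vector as well and always returns a list
res <- lapply(c(1, 2, 3), function(v) v * 2)
```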
Title essentially says it all. I'm having trouble figuring out the difference between initializing a vector with vector(mode="list") and a list with list().
There are some minor differences in the signatures: list() can take value arguments or tag = value arguments, whereas vector() cannot.
And then there's the following quote from the list() documentation:
Almost all lists in R internally are Generic Vectors
So is there any actual difference beside the fact that lists can be initialized with tags and values?
I'd say they're the same:
identical(list(),vector(mode="list", length=0))
## [1] TRUE
(see also this question about the confusing fact that a list is a vector in R: usually when R users refer to "vectors", they actually mean atomic vectors ...)
In my experience the most common use case for vector(mode="list",...) is when you want to initialize a list with length>0. vector(mode="list",10) might be a little more expressive than replicate(10,NULL). If you want to create a length-0 list I can't see any reason to use vector() instead of list().
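A minimal sketch of that pre-allocation use case; the names n and out are just placeholders:

```r
n <- 3
out <- vector(mode = "list", length = n)  # list of n NULL elements

# Fill the pre-allocated slots one by one
for (i in seq_len(n)) {
  out[[i]] <- i^2
}
```

Before the loop runs, out is the same object you would get from list(NULL, NULL, NULL).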
I have two lists of matrices and I want to multiply the first element of the first list by the first element of the second list, and so on, without writing out every operation, because there may be a large number of elements in each list (both lists have the same length).
This is what I mean:
colSums(R1*t(M1)), colSums(R2*t(M2)), ..., colSums(Rn*t(Mn))
Do I need to create an extra list?
First, though, I need to transpose the matrices in one of the lists before multiplying them. The results will be used for further, simpler operations.
I already tried indexes and loops, but that doesn't work.
I first tried to transpose the matrices in one list like this (M is one of the lists and R is the other; M contains M1, M2, ..., Mn, and likewise for R):
for (i in 1:length(M)){Mt<-list(t(M[[i]]))}
but this only applies it to the last element.
The full operation looks like this:
cbind(colSums(R1*t(M1)), colSums(R2*t(M2)), ..., colSums(Rn*t(Mn)))
Help with any of these steps would be useful.
You could use the rlist package.
The function
list.apply(.data, .fun, ...)
will apply a function to each list element.
You can find documentation at https://cran.r-project.org/web/packages/rlist/rlist.pdf.
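If you would rather stay in base R, something along these lines might work. It is only a sketch: the small matrices in R and M are invented for illustration, and it assumes each R[[i]] has the same dimensions as t(M[[i]]). Note that the loop in the question overwrites Mt on every iteration, which is why only the last element survives; lapply(M, t) would give the full list of transposes.

```r
# Hypothetical example data: two lists of 2 x 2 matrices
R <- list(matrix(1:4, 2, 2), matrix(5:8, 2, 2))
M <- list(matrix(1, 2, 2), matrix(2, 2, 2))

# Walk both lists in parallel: element i of R is paired with element i of M
sums <- Map(function(r, m) colSums(r * t(m)), R, M)

# Bind the per-pair column sums into one matrix, one column per pair
result <- do.call(cbind, sums)
```

No extra list of transposed matrices is needed here, because the transpose happens inside the function passed to Map().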
There are MANY posts about indexing lists, but I still can't quite get my head around indexing methods for named and unnamed nested lists. Here's my example
person <- list("name"="John","age"=19,"speaks"=c("English","French"))
Johns_brother <- list("name"="Sam","age"=20,"speaks"=c("English","Spanish"))
Johns_sister <- list("name"="Minerva","age"=17,"speaks"=c("English","Italian"))
Johns_other_sister <- list("name"="Casandra","age"=23,"speaks"=c("English","Greek"))
person <- list("name"="John","age"=19,"speaks"=c("English","French"),"siblings"=list(Johns_brother,Johns_sister,Johns_other_sister))
Both of these indexing methods return lists
class(person$siblings[1])
class(person$siblings[[1]])
But only the second allows me to select named elements
person$siblings[1]$name
person$siblings[[1]]$name
Now I've seen posts that insist (all caps in the original) "A DOUBLE BRACKET WILL NEVER RETURN A LIST. RATHER A DOUBLE BRACKET WILL RETURN ONLY A SINGLE ELEMENT FROM THE LIST" But that's obviously not true since both indexing methods return lists. But the two forms of brackets are returning DIFFERENT lists, right? What is the underlying logic here?
Think about it. The [[ notation indexes the list element. But what if that element itself is a list?
list(a = list(b = 1))[[1]]
# $b
# [1] 1
In the above example, the return value is still a list because a is a list. The value returned depends on the value being indexed. The statement A DOUBLE BRACKET WILL NEVER RETURN A LIST is simply not true.
Help on this can be found in help(Extract) -
Indexing by [ is similar to atomic vectors and selects a list of the specified element(s).
Both [[ and $ select a single element of the list.
It also helps to know the difference between atomic and recursive (list-like) vectors.
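Here is a stripped-down sketch of the siblings case. The list sibs is a simplified stand-in for person$siblings, not the original data:

```r
sibs <- list(list(name = "Sam"), list(name = "Minerva"))

one_bracket <- sibs[1]    # a list of length 1 that *contains* Sam's list
two_bracket <- sibs[[1]]  # Sam's list itself

one_bracket$name          # NULL: the outer list has no element called "name"
two_bracket$name          # "Sam"
one_bracket[[1]]$name     # "Sam": one more level of [[ gets you there
```

Both results are lists, but [ keeps the wrapper around the element while [[ unwraps it.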
I need to subset a list which contains an array as well as a factor variable. Essentially, imagine that each component of the array corresponds to a single individual, who is in turn associated with a two-level factor variable (treatment).
list(array=array(rnorm(2,4,1),c(5,5,10)), treatment= rep(c(1,2),5))
Typically when sub-setting multiple components of the array from the first component of the list I would use something like
list$array[,,c(2,4,6)]
This returns the array components at positions 2, 4 and 6. However, this doesn't work for the treatment component of the list, because subsetting a vector is different; what you need there is:
list$treatment[c(2,4,6)]
I need to subset a list containing different classes (array and vector) by the same relative index.
You're treating your list of matrices as some kind of 3-dimensional object, but it's not.
Your list$matrices is itself a list as well, which means you can index it as a list too; it doesn't matter whether it is a list of matrices, numerics, plot objects, or whatever.
The data you provided as an example can just be indexed at one level, so list$matrices[c(2,4,6)] works fine.
And I don't really get your question about saving the indices in a numeric vector, what's to stop you from this code?
indices <- c(2,4,6)
mysubset <- list(list$matrices[indices], list$treatment[indices])
EDIT, adding new info for edited question:
I see you actually have a 3-D array now. Which is kind of weird, as there is no clear convention of what can be seen as "components". I mean, from your question I understand that list$array[,,n] refers to the n-th individual, but from a pure code point of view there is no reason why something like list$array[n,,] couldn't refer to that.
Maybe you got the idea from other languages, but this is not really R-ish, your earlier example with a list of matrices made more sense to me. And I think the most logical would have been a data.frame with columns matrix and treatment (which is conceptually close to a list with a vector and a list of matrices, but it's clearer to others what you have).
But anyway, what is your desired output?
If it's just subsetting: with this structure, as there are no constraints on what could have been the content, you just have to tell R exactly what you want. There is no one operator that takes a subset of a vector and the 3rd index of an array at the same time. You're going to have to tell R that you want 3rd index to use for subsetting, and that you want to use the same index for subsetting a vector. Which is basically just the code you already have:
idx <- c(2,4,6)
output <- list(list$array[,,idx], list$treatment[idx])
The way you subset multiple matrices actually gives an error, because you supply extra dimensions even though you have already specified which sublist you are in. Hence, to subset the matrices for the given indices you can use my_list[[1]][indices] or, directly, my_list$matrices[indices]. The same applies to treatment: my_list[[2]][indices] or my_list$treatment[indices].
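For the 3-D array version of the question, one option is to wrap the two subsetting steps in a small helper. This is just a sketch: dat and subset_individuals are hypothetical names, and it assumes the third array dimension indexes individuals, as in the question.

```r
# Hypothetical data shaped like the question's example
dat <- list(array = array(rnorm(250, 4, 1), c(5, 5, 10)),
            treatment = rep(c(1, 2), 5))

# Subset the third array dimension and the treatment vector with the same index
subset_individuals <- function(x, idx) {
  list(array = x$array[, , idx, drop = FALSE],  # drop = FALSE keeps 3 dimensions
       treatment = x$treatment[idx])
}

sub <- subset_individuals(dat, c(2, 4, 6))
dim(sub$array)   # 5 5 3
sub$treatment    # 2 2 2
```

Using drop = FALSE means that selecting a single individual still gives back a 3-D array rather than a matrix.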
I learnt that a vector is a sequence of data elements of the same basic type. Then what do we call a in the following code (as it contains both a number and a character)?
a = c(1,"b")
is.vector(a)
[1] TRUE
So is the definition of vector wrong? I referred to this tutorial.
The tutorial simplifies and that can cause confusion. Its definition describes "basic vector types", but there are also "generic vectors".
From the language definition (which you should study):
2.1.1 Vectors
Vectors can be thought of as contiguous cells containing data. Cells
are accessed through indexing operations such as x[5]. More details
are given in Indexing.
R has six basic (‘atomic’) vector types: logical, integer, real,
complex, string (or character) and raw. The modes and storage modes
for the different vector types are listed in the following table.
typeof      mode        storage.mode
logical     logical     logical
integer     numeric     integer
double      numeric     double
complex     complex     complex
character   character   character
raw         raw         raw
Single numbers, such as 4.2,
and strings, such as "four point two" are still vectors, of length 1;
there are no more basic types. Vectors with length zero are possible
(and useful).
2.1.2 Lists
Lists (“generic vectors”) are another kind of data storage. Lists have
elements, each of which can contain any type of R
object, i.e. the elements of a list do not have to be of the same
type. List elements are accessed through three different indexing
operations. These are explained in detail in Indexing.
Lists are vectors, and the basic vector types are referred to as
atomic vectors where it is necessary to exclude lists.
From help("is.vector"):
If mode = "any", is.vector may return TRUE for the atomic modes, list
and expression. For any mode, it will return FALSE if x has any
attributes except names. [...]
(An expression is basically a list.)
Note that factors are not vectors; is.vector returns FALSE and as.vector converts a factor to a character vector for mode = "any".
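A quick sketch of the attribute rule and the factor case; the objects f, x and y are made up for illustration:

```r
f <- factor(c("a", "b"))
is.vector(f)    # FALSE: a factor carries class and levels attributes
as.vector(f)    # character vector "a" "b"

x <- c(a = 1)
is.vector(x)    # TRUE: names are the one attribute that is allowed

y <- 1:3
attr(y, "unit") <- "cm"
is.vector(y)    # FALSE: any other attribute makes is.vector() return FALSE
```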
Finally, as @Henrik points out, c coerces all arguments to the same type.
Actually, in your example, the 1 is coerced to the character string "1" by R.
a<-c(1,"b")
typeof(a[1])
[1] "character"
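To see the coercion order that c() applies, and how a list avoids it, you could try something like the following (values chosen only for illustration):

```r
typeof(c(TRUE, 1L))    # "integer":   logical is promoted to integer
typeof(c(1L, 2.5))     # "double":    integer is promoted to double
typeof(c(1, "b"))      # "character": everything becomes a string

# A list keeps each element's own type
mixed <- list(1, "b")
typeof(mixed[[1]])     # "double"
typeof(mixed[[2]])     # "character"
```

So a stays an atomic vector of a single type (character); nothing in it is stored as a number any more.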