Create a list of integer vectors - r

Let us say I want to create a list where each element is an integer vector.
Let us say I have,
a = c(1,2,3,4)
b = c(7,9,10,3)
d = c(90.2,43.1,54.2,12.3)
And I'd like a list where element 1 of the list is:
c(1,7,90.2)
Second element is,
c(2,9,43.1),
The third element is,
c(3,10,54.2),
and the 4th element is,
c(4,3,12.3).
If I do,
my.list = list(a=a,b=b,d=d)
this gives me the transpose of what I want. Is there a direct way to achieve this?
I need a list because I want to use the mclapply function, which only takes lists as input or (if given data frames) coerces them to the undesired list structure.
Note that in my program these vectors are huge, 400 million entries or so. I am looking for a very fast and efficient way to do this. Thanks!

kk<-Map(function(x,y,w) c(x,y,w),a,b,d)
> kk[1]
[[1]]
[1] 1.0 7.0 90.2
Or just :
kk<-Map(`c`,a,b,d)
> kk[1]
[[1]]
[1] 1.0 7.0 90.2
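As a quick check of the Map() approach, the sketch below rebuilds the list from the question's vectors and inspects it. The index-based variant at the end is an assumption, not something taken from the answers here: with very long vectors it may be cheaper to hand the parallel function the indices than to materialise hundreds of millions of tiny vectors first. The helper process_row is hypothetical, and lapply stands in for parallel::mclapply so the sketch runs on any platform.

```r
a <- c(1, 2, 3, 4)
b <- c(7, 9, 10, 3)
d <- c(90.2, 43.1, 54.2, 12.3)

# One list element per position: c(a[i], b[i], d[i])
kk <- Map(c, a, b, d)
length(kk)   # 4
kk[[2]]      # 2.0 9.0 43.1

# Alternative sketch: iterate over indices and build each small vector on the fly.
# process_row is a hypothetical placeholder for the real per-element work.
process_row <- function(i) sum(c(a[i], b[i], d[i]))
res <- lapply(seq_along(a), process_row)
```

If the index-based form fits the workload, swapping lapply for parallel::mclapply should need no other change, since both take a vector of indices and a function.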

Turn your vectors into one data.frame:
adf<-data.frame(a=a, b=b, d=d)
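Then turn each row into a list element. Note that apply(adf, 1, function(x) list(x)) wraps every row in an extra one-element list, so extract the rows as plain vectors instead:
lapply(seq_len(nrow(adf)), function(i) unlist(adf[i, ], use.names = FALSE))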

Related

How to get access to "str_match_all" results in R?

Just used "str_match_all" as follows:
a <- str_match_all(dd, '\\d+(\\w+)')
and obtained the following:
#[[1]]
# [,1] [,2]
#[1,] "12hours" "hours"
#[2,] "23days" "days"
How can I access each string?
I have tried a[1][,1] to access the first column for example but I get an error saying the number of dimensions is not correct.
If I understand your problem correctly, you are having trouble accessing each individual element.
Remember that your output is a list and that the element in that list is a matrix. To access an individual element, you first select the list element you are interested in, then the row, and then the column.
a[[1]][1,2]
So in your case, this will access the first element in your list (looks like you only have 1), and then the 1st row and then the 2nd column so it will give you, "hours".
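To make the difference between a[1] and a[[1]] concrete, here is a small sketch that uses a hand-built list in place of the real str_match_all() result (the matrix is an assumption that mirrors the output shown in the question, so no packages are needed):

```r
# Stand-in for the str_match_all() result: a list holding one character matrix
a <- list(matrix(c("12hours", "23days", "hours", "days"), ncol = 2))

a[[1]][1, 2]   # "hours": list element 1, then row 1, column 2
a[[1]][, 1]    # whole first column: "12hours" "23days"

# a[1] is still a list (of length 1), so matrix-style indexing fails on it
msg <- tryCatch(a[1][, 1], error = function(e) conditionMessage(e))
msg
```

Single brackets return a sub-list, double brackets return the matrix itself, and only the matrix has rows and columns to index.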
If however, you're more used to working with dataframes as I assume that is your end goal, I would approach this programmatically as follows:
Taking an example from the str_match_all() documentation
# Creating a reproducible example
library(stringr)
strings <- c("Home: 219 733 8965. Work: 229-293-8753 ",
             "banana pear apple", "595 794 7569 / 387 287 6718")
phone <- "([2-9][0-9]{2})[- .]([0-9]{3})[- .]([0-9]{4})"
# One match matrix per input string
a <- str_match_all(strings, phone)
Your goal is to convert the matrix into a data frame, which you do as follows:
as.data.frame(a[[1]])
For future reference, let's say your output has more than one element, as is the case in this example. You could then approach the solution like so:
# Requires dplyr (bind_rows) and purrr (map_df)
library(dplyr)
library(purrr)
# Make a function that accepts one element of your list.
# It repeats the step above and then adds an extra step using dplyr::bind_rows()
output_to_df <- function(x) {
  df <- as.data.frame(x)
  bind_rows(df)
}
# Using this function we then call map_df()
# so that the function is applied to every element
# of the list, no matter how many elements it contains,
# and the results are row-bound into a single data frame
str_output <- map_df(a, output_to_df)
You can now reuse your output_to_df() function as many times as you need.

Looping/printing over a list in R

Let's suppose I have a simple list
v <- list(vec1=c(1,2,3), vec2=c(3,4,5, 6))
I would like to loop over this list and perform some function on its element, so that as an output I get both: name of that particular element and output of the function. For example:
for (i in v) {
  print(sd(i))
}
In this case, the output is:
[1] 1
[1] 1.290994
But I would like to see something like this:
$vec1
[1] 1
$vec2
[1] 1.290994
That way I can easily spot which element each output refers to when my list has many elements. I know it has something to do with the names() function, but I can't make it work.
Using the function names() and outputting a list:
result <- list()
# seq_along() is safe even when v is empty, unlike 1:length(v)
for (i in seq_along(v)) {
  result[[i]] <- sd(v[[i]])
}
names(result) <- names(v)
The downside of this method is that it will assign the wrong names if the resulting list is smaller or greater than the original list (for example, if you add a next statement on the loop or otherwise skip an element).
If possible, a much easier solution is to follow d.b's comment.
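The comment mentioned above is not reproduced here, so the following is only a sketch of one common loop-free approach, not necessarily what that comment proposed. The apply family carries the list names through, which gives the labelled output the question asks for:

```r
v <- list(vec1 = c(1, 2, 3), vec2 = c(3, 4, 5, 6))

# lapply keeps the names of v, so each result stays labelled
result <- lapply(v, sd)
print(result)

# sapply returns a named numeric vector instead of a list
sds <- sapply(v, sd)

# If an explicit loop is needed, iterating over the names keeps labels aligned
for (nm in names(v)) {
  cat(nm, ":", sd(v[[nm]]), "\n")
}
```

Because the names travel with the elements, this avoids the mismatch that can arise when names are assigned after the loop.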

Make a new vector using elements of list and another vector interchangeably

I have a list with 20 elements, each of which contains a vector of 2 numbers. I have also generated a sequence of 20 numbers. Now I would like to construct one long vector that first lists the elements of intervals[[1]] followed by newvals[1], then intervals[[2]] followed by newvals[2], and so on.
I think the plyr package might be helpful, although I am not sure how to structure it. Help will be much appreciated!
s1 <- seq(0, 1, by = 0.05)
intervals <- Map(c, s1[-length(s1)], s1[-1])
intervals[[length(intervals)]][2] <- intervals[[length(intervals)]][2]+0.1
newvals <- seq(1,length(intervals),1)
#### HERE I WOULD LIKE TO HAVE A VECTOR IN THE FOLLOWING PATTERN
####UP TO THE LAST ELEMENT OF THE LIST:
stringreclass <- c(intervals[[1]], newvals[1], ...., intervals[[20]], newvals[20])
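One way the pattern above could be produced without plyr is sketched here, under the assumption that intervals and newvals always have the same length. Map() appends each value to its interval, and unlist() flattens the result in order:

```r
s1 <- seq(0, 1, by = 0.05)
intervals <- Map(c, s1[-length(s1)], s1[-1])
intervals[[length(intervals)]][2] <- intervals[[length(intervals)]][2] + 0.1
newvals <- seq_along(intervals)

# Append newvals[i] to intervals[[i]], then flatten into one long vector
stringreclass <- unlist(Map(c, intervals, newvals))
length(stringreclass)  # 60
stringreclass[1:3]     # 0.00 0.05 1.00
```

Each group of three consecutive values is then the lower bound, the upper bound, and the new value for one interval.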

R select names from a list using logical vector

Suppose I have a named list and I want to get the names of those elements whose components also appear in another vector.
This is my list, neighbors:
neighbors[[1]]
[1] "CNBP" "IGF2BP1" "RPL3|OK/SW-cl.32"
[4] "HNRNPC" "PURA|hCG_45299" "RPS3A"
"Cnbp" "Mis12|DN-393H17.5"
neighbors[[2]]
[1] "NIN" "PRKACA" "AURKA|RP5-1167H4.6"
[4] "GSK3B" "AMOT" "UBC"
and my vector of interest
mtop
[1] "TUBA1A" "DNAJB1" "MME"
[4] "PRKCB" "PARK2|KB-152G3.1" "UBC"
For example, I would like to return the name of neighbors[2], which has UBC in common with mtop.
Any ideas?
First off, your data. Your output appears somewhat strange. If this is not what you have, consider using dput to dump these variables in a reproducible way.
mtop <- c("TUBA1A", "DNAJB1", "MME",
"PRKCB", "PARK2|KB-152G3.1", "UBC")
neighbors <- list(c("CNBP", "IGF2BP1", "RPL3|OK/SW-cl.32",
"HNRNPC", "PURA|hCG_45299", "RPS3A",
"Cnbp", "Mis12|DN-393H17.5"),
c("NIN", "PRKACA", "AURKA|RP5-1167H4.6",
"GSK3B", "AMOT", "UBC"))
To select those elements of list neighbors which have at least one vector element in common with mtop, you can use this command:
matching <- sapply(neighbors, function(l) length(intersect(mtop, l)) > 0)
print(neighbors[matching])
This will print neighbors[2], as it has "UBC" in common with mtop. It does this via the logical vector matching, which seems to be what your question asked for.
If you want to take position into account, i.e. only select neighbors[2] because "UBC" is in position 6 in both vectors, then you should use this command:
matching <- sapply(neighbors, function(l) any(l == mtop))
However, this will create a warning, as neighbors[[1]] is longer than mtop.
If you want the names common to both your data structures, you can use this code:
intersect(unlist(neighbors), mtop)
If you need something else, you have to be more specific in your question, i.e. give an explicit example of what the output should look like, and cover all the possible input configurations that might lead to structurally different output.
How about:
l <- lapply(neighbors, function(x) x[x %in% mtop])
This will return a list in which each element holds the entries that are also in the vector mtop.
Now select only those elements which have non-zero length:
names(l)[sapply(l, length) > 0]
You can combine these into one line:
names(neighbors)[sapply(neighbors, function(x) Reduce("|", mtop %in% x))]
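The name-based selection only returns something useful when the list actually has names. The sketch below uses a shortened version of the data, and the names "geneA" and "geneB" are hypothetical, since the question does not show the real ones. any() is used as a simpler equivalent of the Reduce("|", ...) call:

```r
mtop <- c("TUBA1A", "DNAJB1", "MME", "PRKCB", "PARK2|KB-152G3.1", "UBC")

# "geneA" and "geneB" are made-up names for illustration only
neighbors <- list(
  geneA = c("CNBP", "IGF2BP1", "HNRNPC", "RPS3A"),
  geneB = c("NIN", "PRKACA", "GSK3B", "AMOT", "UBC")
)

# TRUE for every list element that shares at least one entry with mtop
has_match <- vapply(neighbors, function(x) any(x %in% mtop), logical(1))
names(neighbors)[has_match]   # "geneB"
```

On an unnamed list, names(neighbors) is NULL, so which(has_match) would be the way to get the matching positions instead.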

R colon operator on list of matrices

I've created a list of matrices in R. In all matrices in the list, I'd like to "pull out" the collection of matrix elements of a particular index. I was thinking that the colon operator might allow me to implement this in one line. For example, here's an attempt to access the [1,1] elements of all matrices in a list:
myList = list() #list of matrices
myList[[1]] = matrix(1:9, nrow=3, ncol=3, byrow=TRUE) #arbitrary data
myList[[2]] = matrix(2:10, nrow=3, ncol=3, byrow=TRUE)
#I expected the following line to output myList[[1]][1,1], myList[[2]][1,1]
slice = myList[[1:2]][1,1] #prints error: "incorrect number of dimensions"
The final line of the above code throws the error "incorrect number of dimensions."
For reference, here's a working (but less elegant) implementation of what I'm trying to do:
#assume myList has already been created (see the code snippet above)
slice = c()
for(x in 1:2) {
slice = c(slice, myList[[x]][1,1])
}
#this works. slice = [1 2]
Does anyone know how to do the above operation in one line?
Note that my "list of matrices" could be replaced with something else. If someone can suggest an alternative "collection of matrices" data structure that allows me to perform the above operation, then this will be solved.
Perhaps this question is silly...I really would like to have a clean one-line implementation though.
Two things. First, the difference between [ and [[. The relevant sentence from ?'[':
The most important distinction between [, [[ and $ is that the [ can
select more than one element whereas the other two select a single
element.
So you probably want to do myList[1:2]. Second, you can't combine subsetting operations in the way you describe. Once you do myList[1:2] you will get a list of two matrices. A list typically has only one dimension, so doing myList[1:2][1,1] is nonsensical in your case. (See comments for exceptions.)
You might try lapply instead: lapply(myList,'[',1,1).
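The following sketch, using the matrices from the question, shows why the original attempt fails. Given a vector index, [[ indexes recursively, so myList[[1:2]] means "element 2 of element 1" and yields a single number, which has no dimensions left for [1,1]:

```r
myList <- list(
  matrix(1:9,  nrow = 3, ncol = 3, byrow = TRUE),
  matrix(2:10, nrow = 3, ncol = 3, byrow = TRUE)
)

# Recursive indexing: same as myList[[1]][[2]], the second stored value
# of the first matrix (matrices are stored column by column)
myList[[1:2]]   # 4

# [ keeps the list structure and can select several elements
sub <- myList[1:2]
class(sub)      # "list"

# Apply the `[` function to every matrix with arguments 1, 1
first_cells <- lapply(sub, `[`, 1, 1)
```

Here first_cells is a list of two single values; wrapping it in unlist() gives a plain vector.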
If your matrices will all have same dimension, you could store them in a 3-dimensional array. That would certainly make indexing and extracting elements easier ...
## One way to get your data into an array
a <- array(c(myList[[1]], myList[[2]]), dim=c(3,3,2))
## Extract the slice containing the upper left element of each matrix
a[1,1,]
# [1] 1 2
This works:
> sapply(myList,"[",1,1)
[1] 1 2
edit: oh, sorry, I see almost the same idea toward the end of an earlier answer. But sapply probably comes closer to what you want, anyway
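As a variation on the sapply call, vapply does the same extraction but declares the expected result type up front, which can catch surprises when a list element is not what was expected. This is a sketch using the question's matrices; the array form at the end assumes all matrices share the same dimensions:

```r
myList <- list(
  matrix(1:9,  nrow = 3, ncol = 3, byrow = TRUE),
  matrix(2:10, nrow = 3, ncol = 3, byrow = TRUE)
)

# Each call must return exactly one integer, otherwise vapply raises an error
slice <- vapply(myList, function(m) m[1, 1], integer(1))
slice   # 1 2

# Stack the matrices into a 3 x 3 x 2 array and index the third dimension
arr <- array(unlist(myList), dim = c(3, 3, length(myList)))
arr[1, 1, ]   # 1 2
```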

Resources