How to store the result of a loop over combinatoric pairs of a list? - r

I have a matrix (but for the purposes of the example I will simplify to a vector).
I want to loop over all pairs of the list. So if the list is length n (or the matrix has n columns), the resulting list has to be (n choose 2) items long.
Suppose n = 6 for the example; in reality it is 36.
Basically, I want a loop like this:
list=1:6
endlist= vector("list", 15) # 15 from 6!/((4!)(2!))
Here is what I want:
Note the below loop does NOT work since there is no i index, and there appears to be no linear combination of j and k that fits the index. Is there a nonlinear one? Or is there a better way to program this?
for(j in 1:5){
  for(k in (j+1):6){
    endlist[[i]]=list[j]*list[k]
  }
}
Giving the output:
endlist=
[[1]]
[1] 2 3 4 5 6
[[2]]
[1] 6 8 10 12
etc.

There's definitely a better way to code that. I'm not sure how this will necessarily apply to your matrix, but for your example:
combn(list, 2, prod)
#[1] 2 3 4 5 6 6 8 10 12 12 15 18 20 24 30
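combn() produces the combinations of a vector and can apply a function (here prod) to each one. If you really want the output as a list, you can do it with split(). The grouping uses length(list) rather than max(list), so it does not depend on the values happening to be 1:6:
split(combn(list, 2, prod), rep(1:(length(list)-1), times = (length(list)-1):1))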
# $`1`
# [1] 2 3 4 5 6
#
# $`2`
# [1] 6 8 10 12
#
# $`3`
# [1] 12 15 18
#
# $`4`
# [1] 20 24
#
# $`5`
# [1] 30
I think the takeaway here is that it's better to calculate your combinations, and work on those, rather than create the combinations yourself in some kind of loop.
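For the matrix case mentioned in the question, a minimal sketch might look like the following. It assumes the goal is one result per pair of columns; the matrix m, the element-wise product, and the names vals and pair_products are made up for illustration. The second half shows that the original nested loop also works once a running counter i is added.

```r
# Hypothetical 4 x 6 matrix standing in for the real 36-column data
m <- matrix(1:24, nrow = 4, ncol = 6)

# combn() over the column indices; simplify = FALSE returns a list
# with one element per pair of columns (here: their element-wise product)
pair_products <- combn(ncol(m), 2,
                       function(idx) m[, idx[1]] * m[, idx[2]],
                       simplify = FALSE)
length(pair_products)  # 15, i.e. choose(6, 2)

# The nested loop from the question, with a running counter as the index
vals <- 1:6
n <- length(vals)
endlist <- vector("list", choose(n, 2))
i <- 0
for (j in 1:(n - 1)) {
  for (k in (j + 1):n) {
    i <- i + 1
    endlist[[i]] <- vals[j] * vals[k]
  }
}
```

Both approaches visit the pairs in the same order, so unlist(endlist) matches combn(vals, 2, prod).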


R: Divide columns into various subcolumns at specific chosen points / values

I know this might be simple; however, I searched and couldn't find a clear answer, and as an inexperienced user of R, I couldn't develop it myself.
I simply need to divide a column in a list or data frame into several sub-columns (not necessarily of equal length) at certain defined points, given by position or value. I'm dealing with large data, so I am looking for a fast function that divides the column directly according to the chosen points.
To make it clear, I need to make something like:
# data frame
df<- data.frame(cbind("l1"=c(1:20),"l2"=c(21:40)))
# separation points
pts<- c(4, 11, 17)
# dividing into sub columns
gp1<-df$l1[1:pts[1]]
gp2<-df$l1[pts[1]:pts[2]]
gp3<-df$l1[pts[2]:pts[3]]
gp4<-df$l1[pts[3]:20]
# combining
res<- list(gp1, gp2, gp3, gp4)
> res
[[1]]
[1] 1 2 3 4
[[2]]
[1] 4 5 6 7 8 9 10 11
[[3]]
[1] 11 12 13 14 15 16 17
[[4]]
[1] 17 18 19 20
I want to do this without defining the separation points one by one, and without reordering the data by value.
Thanks in advance for your help!
We can use Map to build the index sequences. Prepend 1 to 'pts' to get the start positions and append nrow(df) to 'pts' to get the end positions, then let Map walk over the two vectors in parallel, creating each sequence i:j and extracting the corresponding values of the 'l1' column of 'df':
Map(function(i, j) df$l1[i:j], c(1, pts), c(pts, nrow(df)))
#[[1]]
#[1] 1 2 3 4
#[[2]]
#[1] 4 5 6 7 8 9 10 11
#[[3]]
#[1] 11 12 13 14 15 16 17
#[[4]]
#[1] 17 18 19 20
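As a rough sketch of the same idea with the start and end positions spelled out, plus a variant in which neighbouring pieces do not share their break points: the names starts, ends, grp and res_disjoint are only illustrative, and the disjoint variant is an assumption about what might also be useful, not something the question asked for.

```r
df <- data.frame(l1 = 1:20, l2 = 21:40)
pts <- c(4, 11, 17)

# Start and end index of each piece; break points are shared by neighbours
starts <- c(1, pts)
ends   <- c(pts, nrow(df))
res <- Map(function(i, j) df$l1[i:j], starts, ends)

# Variant without shared break points: every row falls into exactly one group.
# findInterval() with left.open = TRUE assigns rows 1-4 to group 0,
# 5-11 to group 1, 12-17 to group 2 and 18-20 to group 3.
grp <- findInterval(seq_len(nrow(df)), pts, left.open = TRUE)
res_disjoint <- split(df$l1, grp)
```

Both versions are vectorised over the break points, so adding more entries to pts needs no further code.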

How can I remove shared values from a list of vectors

I have a list :
x <- list("a" = c(1:6,32,24) , "b" = c(1:4,8,10,12,13,17,24),
"F" = c(1:5,9:15,17,18,19,20,32))
x
$a
[1] 1 2 3 4 5 6 32 24
$b
[1] 1 2 3 4 8 10 12 13 17 24
$F
[1] 1 2 3 4 5 9 10 11 12 13 14 15 17 18 19 20 32
Each vector in the list shares a number of elements with the others. How can I remove the shared values to get the following result?
$a
[1] 1 2 3 4 5 6 32 24
$b
[1] 8 10 12 13 17
$F
[1] 9 11 14 15 18 19 20
As you can see, the first vector does not change. The elements shared between the first and second vectors are removed from the second vector, and then the elements that the third vector shares with the first and second vectors are removed from the third. The goal of this task is to cluster a data set (the original data set contains 590 objects).
You can use Reduce and setdiff on the list in the reverse order to find all elements of the last vector that do not appear in the others. Bung this into an lapply to run over partial sub-lists to get your desired output:
lapply(seq_along(x), function(y) Reduce(setdiff,rev(x[seq(y)])))
[[1]]
[1] 1 2 3 4 5 6 32 24
[[2]]
[1] 8 10 12 13 17
[[3]]
[1] 9 11 14 15 18 19 20
When scaling up, the number of rev calls may become an issue, so you might want to reverse the list once, outside the lapply as a new variable, and subset that within it.
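One possible way to do that, as a sketch: reverse the list once and index into the reversed copy. The names x_rev, n and res are made up here.

```r
x <- list(a = c(1:6, 32, 24),
          b = c(1:4, 8, 10, 12, 13, 17, 24),
          F = c(1:5, 9:15, 17, 18, 19, 20, 32))

n <- length(x)
x_rev <- rev(x)  # reversed once, outside the lapply

# For element y, the last y entries of x_rev are x[[y]], x[[y-1]], ..., x[[1]],
# so Reduce(setdiff, ...) strips everything seen earlier from x[[y]]
res <- lapply(seq_along(x), function(y) Reduce(setdiff, x_rev[(n - y + 1):n]))
names(res) <- names(x)
```

Restoring the names at the end is optional; the lapply over seq_along(x) otherwise returns an unnamed list.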
x <- list("a" = c(1:6,32,24) ,
"b" = c(1:4,8,10,12,13,17,24),
"F" = c(1:5,9:15,17,18,19,20,32))
This is inefficient since it re-makes the union of the previous set of lists at each step (rather than keeping a running total), but it was the first way I thought of.
for (i in 2:length(x)) {
  ## construct union of all previous lists
  prev <- Reduce(union,x[1:(i-1)])
  ## remove shared elements from the current list
  x[[i]] <- setdiff(x[[i]],prev)
}
You could probably improve this by initializing prev as numeric(0) and updating it to c(prev, x[[i-1]]) at each step (although this grows a vector at each step, which is a slow operation). If you don't have a gigantic data set and don't have to do this operation millions of times, the loop as written is probably good enough.
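A minimal sketch of that running-total variant, assuming the same x as in the question; seen and out are illustrative names rather than anything from the original answer.

```r
x <- list(a = c(1:6, 32, 24),
          b = c(1:4, 8, 10, 12, 13, 17, 24),
          F = c(1:5, 9:15, 17, 18, 19, 20, 32))

seen <- numeric(0)  # everything kept so far
out <- x
for (i in seq_along(x)) {
  # keep only the values not seen in any earlier vector
  out[[i]] <- setdiff(x[[i]], seen)
  # the kept values are new by construction, so c() acts as a union here
  seen <- c(seen, out[[i]])
}
```

Using seq_along(x) also avoids the 2:length(x) pitfall when the list has a single element.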

R break up data frame into list using vector of number of rows

I have a data.frame that I want to break up into a list of data.frames using a vector that will tell me how many rows should be in each consecutive list element.
Sample Data
vectornom <- c(1,2,4,3)
df <- data.frame(x=1:10,y=11:20)
Desired result
> new_list
[[1]]
x y
1 11
[[2]]
x y
2 12
3 13
[[3]]
x y
4 14
5 15
6 16
7 17
[[4]]
x y
8 18
9 19
10 20
I appreciate your help
You can use the (pretty awesome) split function for this, using vectornom to create the index on which to "split":
split(df, rep(seq_along(vectornom), vectornom))
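To see why this works, it may help to look at the grouping vector on its own. The sketch below uses the sample data from the question; grp and new_list are just illustrative names, and the stopifnot() guard is an added assumption that the counts should cover every row.

```r
vectornom <- c(1, 2, 4, 3)
df <- data.frame(x = 1:10, y = 11:20)

# Guard: the group sizes must add up to the number of rows
stopifnot(sum(vectornom) == nrow(df))

# Group label for every row: 1 2 2 3 3 3 3 4 4 4
grp <- rep(seq_along(vectornom), times = vectornom)

new_list <- split(df, grp)
sapply(new_list, nrow)  # 1 2 4 3
```

Each element of new_list is a data.frame that keeps its original row names.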

Recursive looping in r

I am new to R, but I want to loop through the elements of a given list recursively. To be precise, I have a list of vectors where the first vector is (1,2,3,4). I want to loop through this vector and append the second vector obtained to the original list, then loop through the second vector in the list to get the third vector, which is also appended to the original list, and so on. I have this code to start with:
occlist <- list()
occ_cell <- c(1,2,3,4)
for(i in occ_cell){
  occ_cell <- seq(i,4*i, by = 1)
  occlist[[i]] <- occ_cell
}
This gives the following list:
#[[1]]
#[1] 1 2 3 4
#[[2]]
#[1] 2 3 4 5 6 7 8
#[[3]]
# [1] 3 4 5 6 7 8 9 10 11 12
#[[4]]
# [1] 4 5 6 7 8 9 10 11 12 13 14 15 16
To be clearer, consider the following figure:
recOcc <- function(i) {
  if (i == 0) return ( NULL )
  append( recOcc(i-1), list(seq(i, 4*i)) )
}
And, call with (to reproduce your output)
recOcc(4)
# [[1]]
# [1] 1 2 3 4
#
# [[2]]
# [1] 2 3 4 5 6 7 8
#
# [[3]]
# [1] 3 4 5 6 7 8 9 10 11 12
#
# [[4]]
# [1] 4 5 6 7 8 9 10 11 12 13 14 15 16
You can also use Recall in place of the function's own name in the recursive call, so the function keeps working even if it is later renamed.
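A small sketch of the Recall variant: the function is copied to a hypothetical new name (renamed) and the original name is removed, to show that the recursion no longer depends on it.

```r
recOcc <- function(i) {
  if (i == 0) return(NULL)
  # Recall() calls the currently running function, whatever it is named
  prev <- Recall(i - 1)
  append(prev, list(seq(i, 4 * i)))
}

renamed <- recOcc  # hypothetical new name for the same function
rm(recOcc)         # the original name is gone

out <- renamed(4)
length(out)  # 4
```

With recOcc(i-1) hard-coded in the body, the call to renamed(4) would fail after rm(recOcc).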
Edit
For the tree structure, you could try this
## i is the number to start the sequence
## depth determines how deep to recurse
recOcc2 <- function(i, depth=3, cur.depth=0) {
  if (depth==cur.depth) return(seq(i, 4*i))
  acc <- as.list(seq(i, 4*i))
  for (ii in seq_along(acc))
    acc[[ii]] <- recOcc2(acc[[ii]], depth, cur.depth+1)
  acc
}
## To recreate the simple list
res <- recOcc2(1, depth=1)
## For nested lists
res <- recOcc2(1, depth=2)
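To get a feel for what the two calls return, the structure can be inspected with lengths(). This is only a sketch that repeats the function so it runs on its own; flat and nested are illustrative names.

```r
recOcc2 <- function(i, depth = 3, cur.depth = 0) {
  if (depth == cur.depth) return(seq(i, 4 * i))
  acc <- as.list(seq(i, 4 * i))
  for (ii in seq_along(acc))
    acc[[ii]] <- recOcc2(acc[[ii]], depth, cur.depth + 1)
  acc
}

flat   <- recOcc2(1, depth = 1)  # list of 4 vectors
nested <- recOcc2(1, depth = 2)  # list of 4 lists of vectors

lengths(flat)     # 4 7 10 13: lengths of the vectors
lengths(nested)   # 4 7 10 13: number of children under each node
nested[[2]][[1]]  # 2 3 4 5 6 7 8
```

Each extra level of depth replaces every number in the sequence with its own sequence, so the result grows quickly; str(nested, max.level = 2) is a handy way to view it.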

Loop through a vector of vectors

When I loop through a vector of vectors, the result of each loop is several vectors. I would expect the result of each loop to be a vector. Please see the following example:
foo <- seq(from=1, to=5, by=1)
bar <- seq(from=6, to=10, by=1)
baz <- seq(from=11, to=15, by=1)
vects <- c(foo,bar,baz)
for(v in vects) {print(v)}
# [1] 1
# [1] 2
# [1] 3
# [1] 4
# [1] 5
# [1] 6
# [1] 7
# [1] 8
# [1] 9
# [1] 10
# [1] 11
# [1] 12
# [1] 13
# [1] 14
# [1] 15
This is odd as I would expect three vectors given it (should) iterate three times given the vector, c(foo,bar,baz). Something like:
# [1] 1 2 3 4 5
# [1] 6 7 8 9 10
# [1] 11 12 13 14 15
Can anyone explain why I am getting this result (15 vectors) and how to achieve the result I am looking for (3 vectors)?
Look at what vects is:
> vects
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
c() joins (in this case) the three vectors, concatenating them into a single vector of 15 elements. In the for() loop, v takes on each value in vects in turn and prints it, hence the result you see.
Did you want a list of the three separate vectors? If so
> vects2 <- list(foo, bar, baz)
> for(v in vects2) {print(v)}
[1] 1 2 3 4 5
[1] 6 7 8 9 10
[1] 11 12 13 14 15
In other words, form a list of the vectors, not a combination of the vectors.
Substitute vects <- list(foo,bar,baz) for vects <- c(foo,bar,baz).
There is no such thing (really) as a vector of vectors.
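A short sketch of the difference, using the vectors from the question. The names flat, nested and sums are made up here, and the named list is just one convenient way to keep track of which vector is which.

```r
foo <- seq(from = 1, to = 5, by = 1)
bar <- seq(from = 6, to = 10, by = 1)
baz <- seq(from = 11, to = 15, by = 1)

flat   <- c(foo, bar, baz)                       # one vector of 15 numbers
nested <- list(foo = foo, bar = bar, baz = baz)  # three separate vectors

length(flat)    # 15
length(nested)  # 3

# Looping over the list visits each vector as a whole
for (v in nested) print(v)

# The same idea without an explicit loop
sums <- sapply(nested, sum)
```

If the vectors all have the same length, a matrix built with rbind(foo, bar, baz) and looped over by row would be another option.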
