I cannot convert my list to a lower triangular matrix - r

I have a list of two logical (TRUE/FALSE) lower triangular matrices. I would like to replace all the TRUE values with numbers I choose. I wrote some code, but it does not work. The example below is very similar to my real problem, which is much more complicated.
Here is my code:
> Matrix
[[1]]
[,1] [,2] [,3] [,4] [,5]
[1,] FALSE FALSE FALSE FALSE FALSE
[2,] TRUE FALSE FALSE FALSE FALSE
[3,] TRUE TRUE FALSE FALSE FALSE
[4,] TRUE TRUE TRUE FALSE FALSE
[5,] TRUE TRUE TRUE TRUE FALSE
[[2]]
[,1] [,2] [,3] [,4] [,5]
[1,] FALSE FALSE FALSE FALSE FALSE
[2,] TRUE FALSE FALSE FALSE FALSE
[3,] TRUE TRUE FALSE FALSE FALSE
[4,] TRUE TRUE TRUE FALSE FALSE
[5,] TRUE TRUE TRUE TRUE FALSE
np1 <- 10
np2 <- 10
np <-list(np1, np2)
Here are the numbers that need to take the places of the TRUE values:
new <- list(c(1,2,3,4,5,6,7,8,9,10),c(1,2,3,4,5,6,7,8,9,10))
To do so, I wrote this code:
new1 <- lapply(1:2, function(i) matrix(0, 5, 5))
new1 <- lapply(1:2, function(i){new1[[i]][Matrix[[i]]] <- new[[i]][1:np[[i]]]})

Try the following.
new1 <- lapply(1:2, function(i) matrix(0, 5, 5))
new1 <- lapply(1:2, function(i){new1[[i]][Matrix[[i]]][] <- 1:np[[i]]; new1[[i]]})
new1
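The anonymous function in the second lapply needs to return the modified matrix. In your version the last expression is the assignment itself, whose value is the right-hand side, so lapply collected the vectors of numbers rather than the 5x5 matrices. The trailing empty [] is harmless here, since assigning through a logical matrix index already keeps the dimensions of new1[[i]]. If your replacement values are not simply 1 to np[[i]], use new[[i]][1:np[[i]]] on the right-hand side. The following exact equivalent may be more readable.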
new1 <- lapply(1:2, function(i){
new1[[i]][Matrix[[i]]][] <- 1:np[[i]] # note the last []
new1[[i]]
})
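As a hedged alternative sketch, Map can walk over the list of masks and the list of values in parallel, which avoids indexing by i. The names mask, vals and filled are illustrative stand-ins (mask plays the role of Matrix, vals the role of new), and the sketch assumes each value vector has at least as many entries as the mask has TRUE cells.

```r
# Illustrative stand-ins for the question's `Matrix` and `new` lists
mask <- replicate(2, lower.tri(matrix(0, 5, 5)), simplify = FALSE)
vals <- list(1:10, 11:20)

filled <- Map(function(m, v) {
  out <- matrix(0, nrow(m), ncol(m))  # start from an all-zero matrix
  out[m] <- v[seq_len(sum(m))]        # fill the TRUE cells in column-major order
  out                                 # return the matrix, not the assigned values
}, mask, vals)

filled[[1]]
```

Note that the values are placed column by column, so the first column of the lower triangle is filled before the second.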

Related

How to merge logical vectors into a new column

I have many columns of logical vectors and would like to merge two or more of them into one column, so that the merged column is TRUE whenever any of the merged columns is TRUE in that row.
Here is an example with 2 columns and the various combinations:
X <- c(T,F,T,F,F,T,F,T,T,F,F,F)
Y <- matrix(X,nrow = 6, ncol = 2)
Y
[,1] [,2]
[1,] TRUE FALSE
[2,] FALSE TRUE
[3,] TRUE TRUE
[4,] FALSE FALSE
[5,] FALSE FALSE
[6,] TRUE FALSE
How can I create a 3rd column that is TRUE if either column is TRUE and FALSE if both are FALSE? Would this also work if there were 3 or more columns to combine?
If you have logical vectors in all the columns, you can use rowSums
cbind(Y, rowSums(Y) > 0)
# [,1] [,2] [,3]
#[1,] TRUE FALSE TRUE
#[2,] FALSE TRUE TRUE
#[3,] TRUE TRUE TRUE
#[4,] FALSE FALSE FALSE
#[5,] FALSE FALSE FALSE
#[6,] TRUE FALSE TRUE
This returns TRUE if there is at least one TRUE in a row and FALSE otherwise. It works for any number of columns.
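Here is a base R approach using a data frame:
X <- c(T,F,T,F,F,T,F,T,T,F,F,F)
Y <- as.data.frame(matrix(X,nrow = 6, ncol = 2))
unique(Y$V1) # inspect the distinct values in the first column
# combine the logical columns directly; comparing against the string "TRUE"
# is unnecessary and would produce character values instead of logicals
Y$condition <- Y$V1 | Y$V2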
Here is a possible solution using apply() and logical operator | that will work for any number of columns of Y.
result = cbind(Y, apply(Y, 1, FUN = function (x) Reduce(f="|", x)))
result
# [,1] [,2] [,3]
# [1,] TRUE FALSE TRUE
# [2,] FALSE TRUE TRUE
# [3,] TRUE TRUE TRUE
# [4,] FALSE FALSE FALSE
# [5,] FALSE FALSE FALSE
# [6,] TRUE FALSE TRUE
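The approaches above should agree with each other for any number of columns. A minimal sketch with three columns follows; Y3 and the result names are made up for illustration.

```r
# Three logical columns; Y3 is an illustrative example matrix
Y3 <- cbind(c(TRUE, FALSE, FALSE),
            c(FALSE, FALSE, TRUE),
            c(FALSE, FALSE, FALSE))

any_true   <- apply(Y3, 1, any)                 # row-wise OR via any()
via_sums   <- rowSums(Y3) > 0                   # TRUE counts as 1
via_reduce <- Reduce(`|`, as.data.frame(Y3))    # OR the columns together

cbind(Y3, any_true)
```

Using any() inside apply() is a slightly shorter spelling of the Reduce("|", x) idea, and Reduce over the columns avoids the row-by-row loop entirely.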

How to add matrices generated through manipulations on three lists in R?

I currently have three lists List.1, List.2, and List.3 which each contains 500 matrices, each of which has dimensions 100 x 100. Hence, List.1[[1]] is a matrix of dimensions 100 x 100.
For each matrix in List.2, I would like to find which elements lie strictly between the corresponding elements of the matching matrices in List.1 and List.3. For one matrix from each of the 3 lists, the operation is:
+(List.2[[1]] < List.3[[1]] & List.2[[1]] > List.1[[1]])
which returns a matrix of 1's and 0's, with 1 where the condition above is satisfied and 0 where it is not.
I would like to then do this over all 500 matrices in the lists and add up the results, without having to resort to loops. Is there a way to do this with the Reduce or lapply function, or both?
So far what I have is:
zero.one.mat <- List.1[[1]]-List.1[[1]] # Create empty zero matrix
for(i in 1:500){
zero.one.mat <- zero.one.mat + +(List.2[[i]] < List.3[[i]] & List.2[[i]] > List.1[[i]])
}
which obviously isn't the ideal way to do it. Any thoughts would be appreciated. Thanks!
list.1 <- list()
list.2 <- list()
list.3 <- list()
N=5
list.1[[1]] <- matrix(1,nrow=N, ncol=N)
list.2[[1]] <- matrix(2,nrow=N, ncol=N)
list.3[[1]] <- matrix(3,nrow=N, ncol=N)
list.1[[2]] <- matrix(-1,nrow=N, ncol=N)
list.2[[2]] <- matrix(-2,nrow=N, ncol=N)
list.3[[2]] <- matrix(-3,nrow=N, ncol=N)
list.1[[3]] <- matrix(rnorm(N*N),nrow=N, ncol=N)
list.2[[3]] <- matrix(rnorm(N*N),nrow=N, ncol=N)
list.3[[3]] <- matrix(rnorm(N*N),nrow=N, ncol=N)
list.result <- lapply(1:length(list.1), FUN=function(i){list.2[[i]] < list.3[[i]] & list.2[[i]] > list.1[[i]]})
# [[1]]
# [,1] [,2] [,3] [,4] [,5]
# [1,] TRUE TRUE TRUE TRUE TRUE
# [2,] TRUE TRUE TRUE TRUE TRUE
# [3,] TRUE TRUE TRUE TRUE TRUE
# [4,] TRUE TRUE TRUE TRUE TRUE
# [5,] TRUE TRUE TRUE TRUE TRUE
#
# [[2]]
# [,1] [,2] [,3] [,4] [,5]
# [1,] FALSE FALSE FALSE FALSE FALSE
# [2,] FALSE FALSE FALSE FALSE FALSE
# [3,] FALSE FALSE FALSE FALSE FALSE
# [4,] FALSE FALSE FALSE FALSE FALSE
# [5,] FALSE FALSE FALSE FALSE FALSE
#
# [[3]]
# [,1] [,2] [,3] [,4] [,5]
# [1,] FALSE FALSE FALSE TRUE FALSE
# [2,] FALSE FALSE FALSE FALSE FALSE
# [3,] FALSE FALSE FALSE FALSE FALSE
# [4,] FALSE TRUE FALSE FALSE FALSE
# [5,] FALSE FALSE FALSE TRUE TRUE
# If you need to find the sum of all of them, then you can add Reduce:
Reduce("+",lapply(1:length(list.1),
FUN=function(i){
list.2[[i]] < list.3[[i]] &
list.2[[i]] > list.1[[i]]}))
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 1 1 2 1
# [2,] 1 1 1 1 1
# [3,] 1 1 1 1 1
# [4,] 1 2 1 1 1
# [5,] 1 1 1 2 2
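The same idea can be sketched with Map, which iterates over the three lists in parallel instead of indexing by i. The lists lo, mid and hi are small deterministic stand-ins for List.1, List.2 and List.3, chosen only so the result is easy to check by hand.

```r
# Small deterministic stand-ins for the three lists (names are illustrative)
lo  <- list(matrix(0, 2, 2), matrix(0, 2, 2))
mid <- list(matrix(c(1, -1, 1, 5), 2, 2), matrix(1, 2, 2))
hi  <- list(matrix(2, 2, 2), matrix(2, 2, 2))

# Element-wise "strictly between" test for each triple of matrices
between <- Map(function(a, b, c) b > a & b < c, lo, mid, hi)

# Sum the logical matrices; TRUE counts as 1, FALSE as 0
counts <- Reduce(`+`, between)
counts
```

Each entry of counts is the number of list positions at which the middle value fell strictly between the outer two.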

Matrix comparing each element in vector1 to each element in vector2

I want to compare each element in one vector (D) to each element in another vector (E) such that I get a matrix with dimensions length(D)xlength(E).
The comparison in question is of the form:
abs(D[i]-E[j])<0.1
So for
D <- c(1:5)
E <- c(2:6)
I want to get
[,1] [,2] [,3] [,4] [,5]
[1,] FALSE TRUE FALSE FALSE FALSE
[2,] FALSE FALSE TRUE FALSE FALSE
[3,] FALSE FALSE FALSE TRUE FALSE
[4,] FALSE FALSE FALSE FALSE TRUE
[5,] FALSE FALSE FALSE FALSE FALSE
(Or 1s and 0s to the same effect)
I have been able to get that output by doing something clunky like:
rbind(D%in%E[1],D%in%E[2],D%in%E[3],D%in%E[4],D%in%E[5])
and I could write a loop for 1:length(E), but surely there is a simple name and simple code for this operation? I have been struggling to find the language to search for an answer to this question.
You can use outer to perform the calculation in a vectorized manner across all pairs of elements in D and E:
outer(E, D, function(x, y) abs(x-y) < 0.1)
# [,1] [,2] [,3] [,4] [,5]
# [1,] FALSE TRUE FALSE FALSE FALSE
# [2,] FALSE FALSE TRUE FALSE FALSE
# [3,] FALSE FALSE FALSE TRUE FALSE
# [4,] FALSE FALSE FALSE FALSE TRUE
# [5,] FALSE FALSE FALSE FALSE FALSE
I see two benefits over the sort of approach you've included in your question:
It is less typing
It is more efficient: the function is called just once with every single pair of x and y values, so it should be quicker than comparing E[1] against every element of D, then E[2], and so on.
Actually, a more direct approach (thanks to @alexis_laz), which assumes D and E have the same length, would be:
n = length(E)
abs(E - matrix(D, ncol=n, nrow=n, byrow=TRUE))<0.1
# [,1] [,2] [,3] [,4] [,5]
#[1,] FALSE TRUE FALSE FALSE FALSE
#[2,] FALSE FALSE TRUE FALSE FALSE
#[3,] FALSE FALSE FALSE TRUE FALSE
#[4,] FALSE FALSE FALSE FALSE TRUE
#[5,] FALSE FALSE FALSE FALSE FALSE
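Unlike the matrix-based version, outer does not need the two vectors to have the same length. A small sketch with made-up vectors, where the first argument gives the rows and the second gives the columns:

```r
# Illustrative vectors of different lengths
D <- c(1, 2, 3)
E <- c(2.05, 3, 4, 10)

# near_mat[i, j] is TRUE when D[i] and E[j] differ by less than 0.1
near_mat <- outer(D, E, function(d, e) abs(d - e) < 0.1)

dim(near_mat)                    # length(D) rows, length(E) columns
which(near_mat, arr.ind = TRUE)  # row/column positions of the matches
```

Swapping the first two arguments transposes the result, which is why the answer above passes E first to reproduce the layout in the question.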

retrieve specific entries of a matrix based on values from a data frame

I have a data frame of the form:
my.df = data.frame(ID=c(1,2,3,4,5,6,7), STRAND=c('+','+','+','-','+','-','+'), COLLAPSE=c(0,0,1,0,1,0,0))
and a matrix of dimensions nrow(my.df) by nrow(my.df). It is a correlation matrix, but that's not important for the discussion.
For example:
mat = matrix(rnorm(n=nrow(my.df)*nrow(my.df),mean=1,sd=1), nrow = nrow(my.df), ncol=nrow(my.df))
The question is how to retrieve only the upper triangle elements of matrix mat for pairs of rows in my.df that both have COLLAPSE == 0 and are on the same strand.
In this specific example, I'd be interested in retrieving the following entries from matrix mat in a vector:
mat[1,2]
mat[1,7]
mat[2,7]
mat[4,6]
The logic is as follows: rows 1 and 2 are on the same strand and both have a collapse value of zero, so mat[1,2] should be retrieved; row 3 would never be combined with any other row because it has collapse value = 1; rows 1 and 7 are on the same strand and both have collapse value = 0, so mat[1,7] should also be retrieved, and so on.
I could write a for loop but I am looking for a more crantastic way to achieve such results...
Here's one way to do it using outer:
First, find indices with identical STRAND values and where COLLAPSE == 0:
idx <- with(my.df, outer(STRAND, STRAND, "==") &
outer(COLLAPSE, COLLAPSE, function(x, y) x == 0 & y == 0))
# [,1] [,2] [,3] [,4] [,5] [,6] [,7]
# [1,] FALSE TRUE FALSE FALSE FALSE FALSE TRUE
# [2,] TRUE FALSE FALSE FALSE FALSE FALSE TRUE
# [3,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
# [4,] FALSE FALSE FALSE FALSE FALSE TRUE FALSE
# [5,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
# [6,] FALSE FALSE FALSE TRUE FALSE FALSE FALSE
# [7,] TRUE TRUE FALSE FALSE FALSE FALSE FALSE
Second, set values in lower triangle and on the diagonal to FALSE. Create a numeric index:
idx2 <- which(idx & upper.tri(idx), arr.ind = TRUE)
# row col
# [1,] 1 2
# [2,] 4 6
# [3,] 1 7
# [4,] 2 7
Extract values:
mat[idx2]
# [1] 1.72165093 0.05645659 0.74163428 3.83420241
Here's one way to do it.
# select only the 0 collapse records
sel <- my.df$COLLAPSE==0
# split the data frame by strand
groups <- split(my.df$ID[sel], my.df$STRAND[sel])
# generate all possible pairs of IDs within the same strand
pairs <- lapply(groups, combn, 2)
# subset the entries from the matrix
lapply(pairs, function(ij) mat[t(ij)])
df <- my.df[my.df$COLLAPSE == 0, ]
strand <- c("+", "-")
idx <- do.call(rbind, lapply(strand, function(strand){
t(combn(x = df$ID[df$STRAND == strand], m = 2))
}))
idx
# [,1] [,2]
# [1,] 1 2
# [2,] 1 7
# [3,] 2 7
# [4,] 4 6
mat[idx]
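To make the outer-based approach easy to check by hand, here is a sketch that combines the three conditions into a single logical mask. It replaces the random mat with a deterministic matrix (an assumption made purely for illustration), and the names ok, keep and hits are hypothetical.

```r
my.df <- data.frame(ID = 1:7,
                    STRAND = c('+', '+', '+', '-', '+', '-', '+'),
                    COLLAPSE = c(0, 0, 1, 0, 1, 0, 0),
                    stringsAsFactors = FALSE)
n <- nrow(my.df)
mat <- matrix(seq_len(n * n), n, n)  # deterministic stand-in for the random matrix

ok <- my.df$COLLAPSE == 0
keep <- outer(my.df$STRAND, my.df$STRAND, "==") &  # same strand
  outer(ok, ok, "&") &                             # both rows have COLLAPSE == 0
  upper.tri(mat)                                   # upper triangle only

hits <- which(keep, arr.ind = TRUE)  # (row, col) pairs, in column-major order
mat[keep]                            # the selected entries, in the same order
```

The pairs come out as (1,2), (4,6), (1,7), (2,7), which is the same set as in the question but ordered by column rather than by strand.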

Vectorize comparison of values in dataframe

I am trying to compare the value of a parameter in each row of a dataframe with the value of the same parameter in all other rows. The result is a matrix that is TRUE/FALSE at the intersection of each row with each other row. It is pretty simple to implement this in a loop-based manner, but it takes too much processing time with a large dataframe. I am blanking on a way to "vectorize" this code (use apply?) and speed up the processing. Many thanks in advance.
The code that I use so far:
#dim matrix
adjm<- matrix(0,nrow=nrow(df),ncol=nrow(df))
#score
for(i in 1:nrow(df)){
for(t in 1:nrow(df)){
adjm[t,i]=df$varA[i]==df$varA[t]
}
}
You can use outer to vectorize your code
outer(df$varA, df$varA, "==")
For example
df <- data.frame(varA = c(1, 2, 1, 3, 4, 2))
outer(df$varA, df$varA, "==")
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] TRUE FALSE TRUE FALSE FALSE FALSE
## [2,] FALSE TRUE FALSE FALSE FALSE TRUE
## [3,] TRUE FALSE TRUE FALSE FALSE FALSE
## [4,] FALSE FALSE FALSE TRUE FALSE FALSE
## [5,] FALSE FALSE FALSE FALSE TRUE FALSE
## [6,] FALSE TRUE FALSE FALSE FALSE TRUE
With apply:
apply(df,1,function(x) x[1] == df$varA) # `1` should be column number for `varA`
But that's not technically vectorized.
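As a quick sanity check, the loop from the question and the outer call can be compared on the small example data. This is only a sketch; adjm is initialised with FALSE rather than 0 here so that both results are logical matrices.

```r
df <- data.frame(varA = c(1, 2, 1, 3, 4, 2))
n <- nrow(df)

# Loop-based version, as in the question
adjm <- matrix(FALSE, nrow = n, ncol = n)
for (i in seq_len(n)) {
  for (t in seq_len(n)) {
    adjm[t, i] <- df$varA[i] == df$varA[t]
  }
}

# Vectorized version
vec <- outer(df$varA, df$varA, "==")

identical(adjm, vec)  # compare the two results
```

Because equality is symmetric, the result is a symmetric matrix with TRUE on the diagonal, so only one triangle carries independent information.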

Resources