I have a list with 535 elements where each of these elements is a 1575x1575 matrix.
Some of the rows and columns are however entirely NAs.
I want to remove these rows and columns, and I already wrote a line that works when I apply it to a single entry.
But I can't figure out how to apply it to the whole list. covmatrix is my list in this example.
testf <- function(i){
covmatrix[[i]][apply(!is.na(covmatrix[[i]]),2,any),apply(!is.na(covmatrix[[i]]),2,any)]
}
newlist <- lapply(covmatrix, testf)
I get the error code: Error in covmatrix[[i]] : no such Index at Level 1
I guess I do not understand properly how lapply works.
Let's take the following toy example data:
matlist <- lapply(1:3, function(x) matrix(1:9, ncol = 3))
matlist[[2]][1,] <- NA
matlist[[3]][,1] <- NA
matlist
#> [[1]]
#> [,1] [,2] [,3]
#> [1,] 1 4 7
#> [2,] 2 5 8
#> [3,] 3 6 9
#>
#> [[2]]
#> [,1] [,2] [,3]
#> [1,] NA NA NA
#> [2,] 2 5 8
#> [3,] 3 6 9
#>
#> [[3]]
#> [,1] [,2] [,3]
#> [1,] NA 4 7
#> [2,] NA 5 8
#> [3,] NA 6 9
It makes coding a lot easier if we break down the problem into little chunks. For a complex problem, clarity of code is more important than brevity.
First we need a function that will return FALSE if all elements of a vector are NA, and TRUE otherwise:
notallNA <- function(vector) !all(is.na(vector))
Now we write a second function that uses our first function to remove rows and columns that consist purely of NAs from a matrix:
remove_NA <- function(mat) {
valid_rows <- apply(mat, 1, notallNA)
valid_cols <- apply(mat, 2, notallNA)
return(mat[valid_rows, valid_cols])
}
Finally, we can lapply this function to our list of matrices:
lapply(matlist, remove_NA)
#> [[1]]
#> [,1] [,2] [,3]
#> [1,] 1 4 7
#> [2,] 2 5 8
#> [3,] 3 6 9
#>
#> [[2]]
#> [,1] [,2] [,3]
#> [1,] 2 5 8
#> [2,] 3 6 9
#>
#> [[3]]
#> [,1] [,2]
#> [1,] 4 7
#> [2,] 5 8
#> [3,] 6 9
Note that, although we could squash these two functions into one or two lines of code, and do the whole thing as a lambda inside an lapply, the above code is simpler and easier to read / maintain than:
lapply(matlist, function(x) x[apply(x, 1, function(y) !all(is.na(y))),
apply(x, 2, function(y) !all(is.na(y)))])
#> [[1]]
#> [,1] [,2] [,3]
#> [1,] 1 4 7
#> [2,] 2 5 8
#> [3,] 3 6 9
#>
#> [[2]]
#> [,1] [,2] [,3]
#> [1,] 2 5 8
#> [2,] 3 6 9
#>
#> [[3]]
#> [,1] [,2]
#> [1,] 4 7
#> [2,] 5 8
#> [3,] 6 9
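As for why the original attempt failed: lapply hands each element of the list (here, a whole matrix) to the function, not its position, so covmatrix[[i]] ends up being indexed by a matrix. A minimal sketch of the two usual fixes follows; the small covmatrix built here is a hypothetical stand-in for the real 535-element list, and keep is a helper name introduced only for illustration.

```r
# Hypothetical stand-in for the real list: symmetric NA pattern, as in a
# covariance matrix (row 2 and column 2 are entirely NA)
covmatrix <- lapply(1:3, function(k) {
  m <- matrix(as.numeric(1:16), 4, 4)
  m[2, ] <- NA
  m[, 2] <- NA
  m
})

# TRUE for every column that has at least one non-NA value
keep <- function(m) apply(!is.na(m), 2, any)

# Option 1: iterate over positions, so indexing covmatrix[[i]] is valid
by_index <- lapply(seq_along(covmatrix), function(i) {
  m <- covmatrix[[i]]
  m[keep(m), keep(m), drop = FALSE]
})

# Option 2: iterate over the matrices themselves (no index needed)
by_element <- lapply(covmatrix, function(m) m[keep(m), keep(m), drop = FALSE])

identical(by_index, by_element)
```

Reusing the column test for the rows, as the question does, is only safe when the NA pattern is symmetric; otherwise test rows and columns separately as in the answer above.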
Assume that your list of matrices looks like this
set.seed(100)
ls_of_mat <- replicate(5, matrix(sample(c(NA, 1:10), size = 36, T, c(.7, rep(.3 / 10, 10))), 6), F)
[[1]]
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] NA 5 NA NA NA NA
[2,] NA NA NA NA NA 9
[3,] NA NA 4 NA 4 NA
[4,] NA NA NA 2 8 10
[5,] NA NA NA NA NA NA
[6,] NA 8 NA 7 NA 8
[[2]]
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] NA 4 NA NA NA NA
[2,] NA 6 NA NA NA NA
[3,] 1 NA NA NA 10 NA
[4,] NA NA NA NA NA NA
[5,] NA 4 NA NA NA NA
[6,] 3 8 NA NA NA NA
[[3]]
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] NA 6 NA 8 NA NA
[2,] 10 NA NA NA NA NA
[3,] NA NA 7 NA NA NA
[4,] NA NA NA NA 4 NA
[5,] 3 9 NA 8 NA 1
[6,] 4 1 7 NA NA 2
Your logic simplifies to
# 1. find non-NA elements
# 2. drop rows and cols with less than one (zero) non-NA element
lapply(ls_of_mat, function(x) {
is_value <- !is.na(x)
x[rowSums(is_value) > 0L, colSums(is_value) > 0L]
})
Output
[[1]]
[,1] [,2] [,3] [,4] [,5]
[1,] 5 NA NA NA NA
[2,] NA NA NA NA 9
[3,] NA 4 NA 4 NA
[4,] NA NA 2 8 10
[5,] 8 NA 7 NA 8
[[2]]
[,1] [,2] [,3]
[1,] NA 4 NA
[2,] NA 6 NA
[3,] 1 NA 10
[4,] NA 4 NA
[5,] 3 8 NA
[[3]]
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] NA 6 NA 8 NA NA
[2,] 10 NA NA NA NA NA
[3,] NA NA 7 NA NA NA
[4,] NA NA NA NA 4 NA
[5,] 3 9 NA 8 NA 1
[6,] 4 1 7 NA NA 2
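One edge case worth keeping in mind with this kind of subsetting: if only a single row or column survives, R drops the result to a plain vector unless drop = FALSE is given. A small sketch with a made-up 2x2 matrix:

```r
# Made-up example: only the top-left cell holds a value
m <- matrix(c(1, NA, NA, NA), 2, 2)
is_value <- !is.na(m)

# Default behaviour: the 1x1 result collapses to a plain numeric vector
dropped <- m[rowSums(is_value) > 0, colSums(is_value) > 0]

# With drop = FALSE the result stays a matrix
kept <- m[rowSums(is_value) > 0, colSums(is_value) > 0, drop = FALSE]

is.matrix(dropped)
is.matrix(kept)
```

Keeping the matrix class matters if later code calls dim(), nrow() or matrix algebra on every list element.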
cbind(1:2, 1:10)
[,1] [,2]
[1,] 1 1
[2,] 2 2
[3,] 1 3
[4,] 2 4
[5,] 1 5
[6,] 2 6
[7,] 1 7
[8,] 2 8
[9,] 1 9
[10,] 2 10
I want an output like below
[,1] [,2]
[1,] 1 1
[2,] 2 2
[3,] 3
[4,] 4
[5,] 5
[6,] 6
[7,] 7
[8,] 8
[9,] 9
[10,] 10
The trick is to make all your inputs the same length.
x <- 1:2
y <- 1:10
n <- max(length(x), length(y))
length(x) <- n
length(y) <- n
If you want your output to be an array, then cbind works, but you get additional NA values to pad out the rectangle.
cbind(x, y)
x y
[1,] 1 1
[2,] 2 2
[3,] NA 3
[4,] NA 4
[5,] NA 5
[6,] NA 6
[7,] NA 7
[8,] NA 8
[9,] NA 9
[10,] NA 10
To get rid of the NAs, the output must be a list.
Map(function(...)
{
ans <- c(...)
ans[!is.na(ans)]
}, as.list(x), as.list(y)
)
[[1]]
[1] 1 1
[[2]]
[1] 2 2
[[3]]
[1] 3
[[4]]
[1] 4
[[5]]
[1] 5
[[6]]
[1] 6
[[7]]
[1] 7
[[8]]
[1] 8
[[9]]
[1] 9
[[10]]
[1] 10
EDIT: I swapped mapply(..., SIMPLIFY = FALSE) for Map.
I came across a similar problem and would like to suggest an additional solution that some, I hope, may find useful. The solution is fairly straightforward and makes use of the qpcR package and its cbind.na function.
Example
x <- 1:2
y <- 1:10
dta <- qpcR:::cbind.na(x, y)
Results
> head(dta)
x y
[1,] 1 1
[2,] 2 2
[3,] NA 3
[4,] NA 4
[5,] NA 5
[6,] NA 6
Side comments
Following the OP's original example, column names can be easily removed:
colnames(dta) <- NULL
This produces the layout the OP asked for, apart from the NA padding:
> head(dta)
[,1] [,2]
[1,] 1 1
[2,] 2 2
[3,] NA 3
[4,] NA 4
[5,] NA 5
[6,] NA 6
I would like to propose an alternate solution that makes use of the rowr package and their cbind.fill function.
> rowr::cbind.fill(1:2,1:10, fill = NA);
object object
1 1 1
2 2 2
3 NA 3
4 NA 4
5 NA 5
6 NA 6
7 NA 7
8 NA 8
9 NA 9
10 NA 10
Or alternatively, to match the OP's desired output:
> rowr::cbind.fill(1:2,1:10, fill = '');
object object
1 1 1
2 2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10
Helper function...
bind.pad <- function(l, side="r", len=max(sapply(l,length)))
{
if (side %in% c("b", "r")) {
out <- sapply(l, 'length<-', value=len)
} else {
out <- sapply(sapply(sapply(l, rev), 'length<-', value=len, simplify=F), rev)}
if (side %in% c("r", "l")) out <- t(out)
out
}
Examples:
> l <- lapply(c(3,2,1,2,3),seq)
> lapply(c("t","l","b","r"), bind.pad, l=l, len=4)
[[1]]
[,1] [,2] [,3] [,4] [,5]
[1,] NA NA NA NA NA
[2,] 1 NA NA NA 1
[3,] 2 1 NA 1 2
[4,] 3 2 1 2 3
[[2]]
[,1] [,2] [,3] [,4]
[1,] NA 1 2 3
[2,] NA NA 1 2
[3,] NA NA NA 1
[4,] NA NA 1 2
[5,] NA 1 2 3
[[3]]
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 1 1
[2,] 2 2 NA 2 2
[3,] 3 NA NA NA 3
[4,] NA NA NA NA NA
[[4]]
[,1] [,2] [,3] [,4]
[1,] 1 2 3 NA
[2,] 1 2 NA NA
[3,] 1 NA NA NA
[4,] 1 2 NA NA
[5,] 1 2 3 NA
Given that some of the solutions above rely on packages that are no longer available, here is a helper function that uses only tidyverse packages (dplyr, plus map and map_int from purrr).
bind_cols_fill <- function(df_list) {
max_rows <- map_int(df_list, nrow) %>% max()
map(df_list, function(df) {
if(nrow(df) == max_rows) return(df)
first <- names(df)[1] %>% sym()
df %>% add_row(!!first := rep(NA, max_rows - nrow(df)))
}) %>% bind_cols()
}
Note that this takes a list of data frames, so it is slightly cumbersome if one only wants to combine two vectors:
x <- 1:2
y <- 1:10
bind_cols_fill(list(tibble(x), tibble(y)))
Another solution with no dependencies:
my_bind <- function(x, y){
if(length(x = x) > length(x = y)){
len_diff <- length(x) - length(y)
y <- c(y, rep(NA, len_diff))
}else if(length(x = x) < length(x = y)){
len_diff <- length(y) - length(x)
x <- c(x, rep(NA, len_diff))
}
cbind(x, y)
}
my_bind(x = letters[1:4], y = letters[1:2])
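The same idea extends to any number of vectors. The sketch below is one possible base R variant, not part of the answer above; pad_cbind is a hypothetical name, and it relies on `length<-` padding a vector with NA when it is lengthened.

```r
# Hypothetical helper: column-bind any number of vectors, padding with NA
pad_cbind <- function(...) {
  vecs <- list(...)
  n <- max(lengths(vecs))
  # `length<-`(v, n) extends v to length n, filling new slots with NA
  padded <- lapply(vecs, `length<-`, n)
  do.call(cbind, padded)
}

res <- pad_cbind(1:2, 1:10, 1:5)
dim(res)
```

Because the inputs are passed through `...`, the call reads the same as a plain cbind call.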
If I have 2 square Matrices with random NA values, for example:
Matrix A:
1 2 3
1 5 NA 7
2 NA 3 8
3 NA 4 5
Matrix B:
1 2 3
1 NA 8 NA
2 2 5 9
3 NA 4 3
What is the best way to multiply them? Would changing NA values to 0 give a different result of the dot product?
NAs are not ignored; they propagate through both element-wise and matrix multiplication:
## Dummy matrices
mat1 <- matrix(sample(1:9, 9), 3, 3)
mat2 <- matrix(sample(1:9, 9), 3, 3)
## Adding NAs
mat1[sample(1:9, 4)] <- NA
mat2[sample(1:9, 4)] <- NA
mat1
# [,1] [,2] [,3]
#[1,] 9 NA 3
#[2,] 2 NA NA
#[3,] NA 1 8
mat2
# [,1] [,2] [,3]
#[1,] NA NA 4
#[2,] NA 9 3
#[3,] NA 7 1
mat1 * mat2
# [,1] [,2] [,3]
#[1,] NA NA 12
#[2,] NA NA NA
#[3,] NA 7 8
mat1 %*% mat2
# [,1] [,2] [,3]
#[1,] NA NA NA
#[2,] NA NA NA
#[3,] NA NA NA
In this case the matrix product consists only of NAs because no row-by-column sum avoids an NA. Different matrices can lead to different results.
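On the second part of the question: yes, replacing NA with 0 changes the result, because a 0 contributes nothing to a sum while an NA makes the whole sum NA. A minimal sketch using matrices A and B from the question (A0 and B0 are names introduced here for the zero-filled copies):

```r
# Matrices A and B from the question, entered column by column
A <- matrix(c(5, NA, NA,
              NA, 3, 4,
              7, 8, 5), nrow = 3)
B <- matrix(c(NA, 2, NA,
              8, 5, 4,
              NA, 9, 3), nrow = 3)

# With NAs left in place, every row of A contains an NA,
# so every entry of the product is NA
with_na <- A %*% B

# Replace NA by 0 before multiplying
A0 <- replace(A, is.na(A), 0)
B0 <- replace(B, is.na(B), 0)
with_zero <- A0 %*% B0

all(is.na(with_na))
with_zero
```

Whether treating a missing value as 0 is appropriate depends on what the NAs mean in the data; it is a modelling decision, not a neutral default.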
Apologies, if the question is too basic. What would be an effective approach/idea (in R) to convert
list(c(1), c(1,2), c(1,2,3), c(1,2,3,4))
to square matrix form
[,1] [,2] [,3] [,4]
[1,] 1 NA NA NA
[2,] 1 2 NA NA
[3,] 1 2 3 NA
[4,] 1 2 3 4
I suppose there is some quick dynamic way to append just the right number of NA values and then convert to a matrix.
Naturally, the size of the (square) matrix can change.
Thanks in advance for your time.
You can use
## create the list
x <- Map(":", 1, 1:4)
ml <- max(lengths(x))
do.call(rbind, lapply(x, "length<-", ml))
# [,1] [,2] [,3] [,4]
# [1,] 1 NA NA NA
# [2,] 1 2 NA NA
# [3,] 1 2 3 NA
# [4,] 1 2 3 4
Or you could do
library(data.table)
as.matrix(unname(rbindlist(lapply(x, as.data.frame.list), fill = TRUE)))
# [,1] [,2] [,3] [,4]
# [1,] 1 NA NA NA
# [2,] 1 2 NA NA
# [3,] 1 2 3 NA
# [4,] 1 2 3 4
And one more for good measure ... Fore!
m <- stringi::stri_list2matrix(x, byrow = TRUE)
mode(m) <- "numeric"
m
# [,1] [,2] [,3] [,4]
# [1,] 1 NA NA NA
# [2,] 1 2 NA NA
# [3,] 1 2 3 NA
# [4,] 1 2 3 4
I have a matrix with some NA values
for example:
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 2 NA 8 11
[3,] 3 6 NA 12
I want to create a new matrix from the data in the matrix above, with new dimensions and no NA values. (It is OK to have NAs only in the last few elements.)
something like:
[,1] [,2] [,3]
[1,] 1 6 11
[2,] 2 7 12
[3,] 3 8 NA
[4,] 4 10 NA
I would appreciate if anyone can help me.
Thanks
Something like this as well:
m <- matrix(1:12, nc=4)
m[c(5, 9)] <- NA
matrix(c(na.omit(c(m)), rep(NA, sum(is.na(m)))), nrow=4)
m <- matrix(1:12, nc=4)
m[c(5, 9)] <- NA
# create an array of the appropriate class and dimension (filled with NA values)
dims <- c(4, 3)
md <- array(m[0], dim=dims)
# replace first "n" values with non-NA values from m
nonNAm <- na.omit(c(m))
md[seq_along(nonNAm)] <- nonNAm
md
# [,1] [,2] [,3]
# [1,] 1 6 11
# [2,] 2 7 12
# [3,] 3 8 NA
# [4,] 4 10 NA
Yet another attempt. This will keep the order of the values in column order as a matrix usually would. E.g.:
mat <- matrix(c(1,2,3,4,NA,6,7,8,NA,10,11,12),nrow=3)
array(mat[order(is.na(mat))],dim=dim(mat))
# [,1] [,2] [,3] [,4]
#[1,] 1 4 8 12
#[2,] 2 6 10 NA
#[3,] 3 7 11 NA
Now change a value to check it doesn't affect the ordering.
mat[7] <- 20
array(mat[order(is.na(mat))],dim=dim(mat))
# [,1] [,2] [,3] [,4]
#[1,] 1 4 8 12
#[2,] 2 6 10 NA
#[3,] 3 20 11 NA
You can then specify whatever dimensions you feel like to the dim= argument:
array(mat[order(is.na(mat))],dim=c(4,3))
# [,1] [,2] [,3]
#[1,] 1 6 11
#[2,] 2 20 12
#[3,] 3 8 NA
#[4,] 4 10 NA
This is fairly straightforward if you want to preserve order column-wise or row-wise.
originalMatrix <- matrix(c(1,2,3,4,NA,6,7,8,NA,10,11,12),nrow=3)
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 2 NA 8 11
[3,] 3 6 NA 12
newMatrixNums <- originalMatrix[!is.na(originalMatrix)]
[1] 1 2 3 4 6 7 8 10 11 12
Pad with NA:
newMatrixNums2 <- c(newMatrixNums,rep(NA,2))
Column-wise:
matrix(newMatrixNums2,nrow=3)
[,1] [,2] [,3] [,4]
[1,] 1 4 8 12
[2,] 2 6 10 NA
[3,] 3 7 11 NA
Row-wise:
matrix(newMatrixNums2,nrow=3,byrow=T)
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 6 7 8 10
[3,] 11 12 NA NA
Here's one way:
# Reproducing your data
m <- matrix(1:12, nc=4)
m[c(5, 9)] <- NA
# Your desired dimensions
dims <- c(4, 3)
array(c(na.omit(c(m)), rep(NA, prod(dims) - length(na.omit(c(m))))), dim=dims)
# [,1] [,2] [,3]
# [1,] 1 6 11
# [2,] 2 7 12
# [3,] 3 8 NA
# [4,] 4 10 NA
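This can do the job, but I don't know whether it is a good way.
# non-NA values of m, in column-major order
list2 <- m[!is.na(m)]
element1 <- list2
# pad with as many NAs as were removed (length(m) is the total number of cells)
element2 <- rep(NA, length(m) - length(list2))
newm <- matrix(c(element1, element2), nrow=4)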
If you increase the length of a numeric vector with length(x)<- without assigning values to the new elements, the new values are given NA as their value. So length(M2) <- length(M) takes the shorter M2 vector and makes it the same length as M by adding NA values to the new elements.
## original
> (M <- matrix(c(1:4,NA,6:8,NA,10:12), nrow = 3))
# [,1] [,2] [,3] [,4]
# [1,] 1 4 7 10
# [2,] 2 NA 8 11
# [3,] 3 6 NA 12
## new
> M2 <- M[!is.na(M)]; length(M2) <- length(M)
> matrix(M2, ncol(M))
# [,1] [,2] [,3]
# [1,] 1 6 11
# [2,] 2 7 12
# [3,] 3 8 NA
# [4,] 4 10 NA
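The answers above all follow the same recipe: take the non-NA values in column-major order, pad with NA, and reshape. One way to wrap that into a reusable function is sketched below; compact_na and its dims argument are hypothetical names, not taken from any of the answers.

```r
# Hypothetical helper: move all non-NA values to the front (column-major
# order) and reshape to the requested dimensions, padding with NA
compact_na <- function(m, dims = dim(m)) {
  vals <- m[!is.na(m)]
  # the target shape must have room for every non-NA value
  stopifnot(prod(dims) >= length(vals))
  length(vals) <- prod(dims)   # lengthening pads with NA
  array(vals, dim = dims)
}

m <- matrix(1:12, ncol = 4)
m[c(5, 9)] <- NA
out <- compact_na(m, dims = c(4, 3))
out
```

The stopifnot guard is there because shrinking a vector with `length<-` would silently discard values.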
I have a list like the one below, which is a list of lists containing matrices (so "ftable" is a list of ten lists, and each of the inner lists contains seven matrices). I need to calculate the mean of the corresponding matrices, which may also contain NA values. I have tried several ways but I got errors.
for(i in 1:10){
for(j in 1:7){
ftable[[i]][[j]] <- matrix (x,nrow=8,ncol=8, byrow=TRUE)
}
}
> str(ftable)
List of 10
$ :List of 7
.......
.......
As the result I need a list of seven matrices, where matrix i is the result of applying mean to ftable[[1]][[i]], ftable[[2]][[i]], ... , ftable[[10]][[i]], for i in 1:7.
I have tried this but I got error:
meanTable <- list()
for (i in 1:7)
meanTable[[i]] <- matrix (0, nrow=8,ncol=8)
> meanTable <- lapply(1:7, function(i) Reduce(mean, list(ftable[[1]][i],ftable[[2]][i],ftable[[3]][i],ftable[[4]][i],ftable[[5]][i],ftable[[6]][i],ftable[[7]][i],ftable[[8]][i],ftable[[9]][i],ftable[[10]][i])))
Error in mean.default(init, x[[i]]) :
'trim' must be numeric of length one
In addition: Warning message:
In mean.default(init, x[[i]]) :
argument is not numeric or logical: returning NA
one example of the matrices:
> ftable[[1]][[1]]
1 2 3 4 5 6 7 8
1 NA 0.924 0.835 -0.336 0.335 -0.948 0.285 0.749
2 NA NA 0.772 -0.333 0.333 -0.892 0.127 0.715
3 NA NA NA -0.476 0.475 -0.756 0.258 0.749
4 NA NA NA NA -0.999 0.399 -0.150 -0.399
5 NA NA NA NA NA -0.399 0.151 0.399
6 NA NA NA NA NA NA -0.134 -0.715
7 NA NA NA NA NA NA NA 0.144
8 NA NA NA NA NA NA NA NA
I think this is what you require. The easiest way would be to unlist your outer list and then apply Reduce as follows:
I'll create a variation of the input from user1317221_G
set.seed(45)
mat1 <- matrix(c(sample(10),NA,NA),nrow=2)
matlist1 <- list(mat1,mat1,mat1)
mat2 <- matrix(c(sample(11:20),NA,NA),nrow=2)
matlist2 <- list(mat2,mat2,mat2)
bigmatlist <- list(matlist1,matlist2)
> bigmatlist
# [[1]]
# [[1]][[1]]
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 7 2 10 1 4 NA
# [2,] 3 9 8 5 6 NA
#
# [[1]][[2]]
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 7 2 10 1 4 NA
# [2,] 3 9 8 5 6 NA
#
# [[1]][[3]]
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 7 2 10 1 4 NA
# [2,] 3 9 8 5 6 NA
#
#
# [[2]]
# [[2]][[1]]
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 14 13 19 17 15 NA
# [2,] 18 20 16 11 12 NA
#
# [[2]][[2]]
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 14 13 19 17 15 NA
# [2,] 18 20 16 11 12 NA
#
# [[2]][[3]]
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 14 13 19 17 15 NA
# [2,] 18 20 16 11 12 NA
Now the solution.
# in your case, outer.len = 10 and inner.len = 7
outer.len <- 2
inner.len <- 3
prod.len <- outer.len * inner.len
list.un <- unlist(bigmatlist, recursive = FALSE)
o <- lapply(1:inner.len, function(idx) {
Reduce('+', list.un[seq(idx, prod.len, by = inner.len)])/outer.len
})
> o
# [[1]]
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 10.5 7.5 14.5 9 9.5 NA
# [2,] 10.5 14.5 12.0 8 9.0 NA
#
# [[2]]
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 10.5 7.5 14.5 9 9.5 NA
# [2,] 10.5 14.5 12.0 8 9.0 NA
#
# [[3]]
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 10.5 7.5 14.5 9 9.5 NA
# [2,] 10.5 14.5 12.0 8 9.0 NA
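Note that Reduce('+', ...) propagates NA: a cell that is NA in any one of the matrices is NA in the result. If the mean should instead be taken over the non-NA values in each cell, one option is to stack the matrices into a 3-D array and apply mean with na.rm = TRUE. This is a sketch under that assumption; elementwise_mean and the small nested list are made up for illustration.

```r
# Hypothetical helper: cell-wise mean over a list of equally sized
# matrices, ignoring NAs
elementwise_mean <- function(mats) {
  # stack into a rows x cols x n array
  arr <- array(unlist(mats), dim = c(dim(mats[[1]]), length(mats)))
  apply(arr, c(1, 2), mean, na.rm = TRUE)
}

# Made-up nested list: 2 outer lists, each holding 2 matrices
m1 <- matrix(c(1, NA, 3, 4), 2, 2)
m2 <- matrix(c(3, 2, NA, 4), 2, 2)
nested <- list(list(m1, m1), list(m2, m2))

# For each inner position j, average the j-th matrix of every outer list
inner.len <- length(nested[[1]])
o2 <- lapply(seq_len(inner.len), function(j) {
  elementwise_mean(lapply(nested, `[[`, j))
})
o2[[1]]
```

A cell that is NA in every matrix comes out as NaN rather than NA, since the mean of zero values is undefined.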
Do you mean something like this?
mat1 <- matrix(c(1:10,NA,NA),nrow=2)
matlist1 <- list(mat1,mat1,mat1)
bigmatlist <- list(matlist1,matlist1)
mean(mat1, na.rm=TRUE)
#[1] 5.5
sapply(matlist1, function(x) mean(x,na.rm=TRUE))
#[1] 5.5 5.5 5.5
and a list of lists:
sapply(bigmatlist,function(x) sapply(x, function(x) mean(x,na.rm=TRUE)) )
# [,1] [,2]
#[1,] 5.5 5.5
#[2,] 5.5 5.5
#[3,] 5.5 5.5
Here the entry in row 3, column 1 is the mean of matrix three in list 1, i.e. bigmatlist[[1]][[3]].
Swap sapply for lapply where appropriate if you want lists returned.