I have a matrix with some NA values, for example:
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 2 NA 8 11
[3,] 3 6 NA 12
I want to create a new matrix from the data in the matrix above, with new dimensions and no NA values (it is OK if only the last few elements are NA).
Something like:
[,1] [,2] [,3]
[1,] 1 6 11
[2,] 2 7 12
[3,] 3 8 NA
[4,] 4 10 NA
I would appreciate it if anyone can help me.
Thanks
Something like this as well:
m <- matrix(1:12, nc=4)
m[c(5, 9)] <- NA
matrix(c(na.omit(c(m)), rep(NA, sum(is.na(m)))), nrow=4)
m <- matrix(1:12, nc=4)
m[c(5, 9)] <- NA
# create an array of the appropriate class and dimension (filled with NA values)
dims <- c(4, 3)
md <- array(m[0], dim=dims)
# replace first "n" values with non-NA values from m
nonNAm <- na.omit(c(m))
md[seq_along(nonNAm)] <- nonNAm
md
# [,1] [,2] [,3]
# [1,] 1 6 11
# [2,] 2 7 12
# [3,] 3 8 NA
# [4,] 4 10 NA
Yet another attempt. This keeps the values in column-major order, as a matrix normally stores them: order(is.na(mat)) puts the non-NA positions first and leaves ties in their original order. E.g.:
mat <- matrix(c(1,2,3,4,NA,6,7,8,NA,10,11,12),nrow=3)
array(mat[order(is.na(mat))],dim=dim(mat))
# [,1] [,2] [,3] [,4]
#[1,] 1 4 8 12
#[2,] 2 6 10 NA
#[3,] 3 7 11 NA
Now change a value to check it doesn't affect the ordering.
mat[7] <- 20
array(mat[order(is.na(mat))],dim=dim(mat))
# [,1] [,2] [,3] [,4]
#[1,] 1 4 8 12
#[2,] 2 6 10 NA
#[3,] 3 20 11 NA
You can then specify whatever dimensions you feel like to the dim= argument:
array(mat[order(is.na(mat))],dim=c(4,3))
# [,1] [,2] [,3]
#[1,] 1 6 11
#[2,] 2 20 12
#[3,] 3 8 NA
#[4,] 4 10 NA
This is fairly straightforward if you want to preserve order column-wise or row-wise.
originalMatrix <- matrix(c(1,2,3,4,NA,6,7,8,NA,10,11,12),nrow=3)
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 2 NA 8 11
[3,] 3 6 NA 12
newMatrixNums <- originalMatrix[!is.na(originalMatrix)]
[1] 1 2 3 4 6 7 8 10 11 12
Pad with NA:
newMatrixNums2 <- c(newMatrixNums,rep(NA,2))
Column-wise:
matrix(newMatrixNums2,nrow=3)
[,1] [,2] [,3] [,4]
[1,] 1 4 8 12
[2,] 2 6 10 NA
[3,] 3 7 11 NA
Row-wise:
matrix(newMatrixNums2,nrow=3,byrow=TRUE)
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 6 7 8 10
[3,] 11 12 NA NA
Here's one way:
# Reproducing your data
m <- matrix(1:12, nc=4)
m[c(5, 9)] <- NA
# Your desired dimensions
dims <- c(4, 3)
array(c(na.omit(c(m)), rep(NA, prod(dims) - length(na.omit(c(m))))), dim=dims)
# [,1] [,2] [,3]
# [1,] 1 6 11
# [2,] 2 7 12
# [3,] 3 8 NA
# [4,] 4 10 NA
This does the job, though I don't know whether it is a good way.
# non-NA values of m, in column-major order
list2 <- m[!is.na(m)]
element1 <- list2
# pad with as many NA values as were removed
element2 <- rep(NA, length(m) - length(list2))
newm <- matrix(c(element1, element2), nrow=4)
If you increase the length of a vector with length(x) <- n without assigning values to the new elements, the new elements are filled with NA. So length(M2) <- length(M) takes the shorter vector M2 and pads it with NA values until it is the same length as M.
## original
> (M <- matrix(c(1:4,NA,6:8,NA,10:12), nrow = 3))
# [,1] [,2] [,3] [,4]
# [1,] 1 4 7 10
# [2,] 2 NA 8 11
# [3,] 3 6 NA 12
## new
> M2 <- M[!is.na(M)]; length(M2) <- length(M)
> matrix(M2, ncol(M))
# [,1] [,2] [,3]
# [1,] 1 6 11
# [2,] 2 7 12
# [3,] 3 8 NA
# [4,] 4 10 NA
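The answers above all follow the same three steps: drop the NA values, pad the tail, reshape. A minimal sketch that wraps them in one function is shown below; the name compact_matrix and its dims argument are hypothetical, chosen only for illustration.

```r
# Hypothetical helper: drop NAs, keep column-major order, pad the tail with NA.
compact_matrix <- function(m, dims = dim(m)) {
  vals <- m[!is.na(m)]                    # non-NA values in column-major order
  stopifnot(length(vals) <= prod(dims))   # the target must be large enough
  length(vals) <- prod(dims)              # extending a vector pads it with NA
  array(vals, dim = dims)
}

m <- matrix(c(1:4, NA, 6:8, NA, 10:12), nrow = 3)
compact_matrix(m, c(4, 3))
```

Calling compact_matrix(m) without dims keeps the original 3 x 4 shape and simply moves the NA values to the end.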
Related
I have a matrix that contains some NA elements (e.g. mat below), and I want to make a new function that prints it with the NA values hidden (i.e. as fun below). How can I achieve this?
mat <- cbind(c(1,2,NA,NA),c(3,3,3,NA),c(NA,4,4,4),c(NA,NA,5,5))
print(mat)
     [,1] [,2] [,3] [,4]
[1,]    1    3   NA   NA
[2,]    2    3    4   NA
[3,]   NA    3    4    5
[4,]   NA   NA    4    5
fun(mat)
     [,1] [,2] [,3] [,4]
[1,]    1    3
[2,]    2    3    4
[3,]         3    4    5
[4,]              4    5
We can use na.print in print
print(mat, na.print = "")
#      [,1] [,2] [,3] [,4]
# [1,]    1    3
# [2,]    2    3    4
# [3,]         3    4    5
# [4,]              4    5
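Since the question asks for a function, the na.print call can be wrapped in one. This is only a sketch; the name fun is taken from the question, and passing extra arguments through ... is an assumption about how it might be used.

```r
# Wrapper around print() that shows NA cells as empty strings.
fun <- function(x, ...) {
  print(x, na.print = "", ...)
}

mat <- cbind(c(1, 2, NA, NA), c(3, 3, 3, NA), c(NA, 4, 4, 4), c(NA, NA, 5, 5))
fun(mat)
```

The matrix itself is unchanged; only the printed representation hides the NA values.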
I have two matrices m1 and m2.
m1 <- matrix(1:16, ncol = 4)
m2 <- matrix(16:1, ncol = 4)
# > m1
# [,1] [,2] [,3] [,4]
# [1,] 1 5 9 13
# [2,] 2 6 10 14
# [3,] 3 7 11 15
# [4,] 4 8 12 16
# > m2
# [,1] [,2] [,3] [,4]
# [1,] 16 12 8 4
# [2,] 15 11 7 3
# [3,] 14 10 6 2
# [4,] 13 9 5 1
I want to find the minimum between the two matrices for each cell within a moving 3x3 kernel. The outer margins should be ignored, i.e. they can be padded with NAs and min should then be called with na.rm = TRUE. The result should look like this:
# > m3
# [,1] [,2] [,3] [,4]
# [1,] 1 1 3 3
# [2,] 1 1 2 2
# [3,] 2 2 1 1
# [4,] 3 3 1 1
I have already tried a combination of pmin{base} and runmin{caTools} like this:
pmin(runmin(m1, 3, endrule = "keep"),
runmin(m2, 3, endrule = "keep"))
However, this did not work. Probably due to the fact that
"If x is a matrix than each column will be processed separately."
(from ?runmin)
Is there a package that performs such operations, or is it possible to do this with apply?
Here is a base R approach:
m = pmin(m1, m2)
grid = expand.grid(seq(nrow(m)), seq(ncol(m)))
x = apply(grid, 1, function(u) {
  min(m[max(1,u[1]-1):min(nrow(m), u[1]+1), max(1,u[2]-1):min(ncol(m), u[2]+1)])
})
dim(x) = dim(m)
#> x
# [,1] [,2] [,3] [,4]
#[1,] 1 1 3 3
#[2,] 1 1 2 2
#[3,] 2 2 1 1
#[4,] 3 3 1 1
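The question describes the margins as NA padding combined with na.rm = TRUE. A sketch of that variant, as an alternative to clamping the indices, might look like this; the names padded and m3 are illustrative.

```r
m1 <- matrix(1:16, ncol = 4)
m2 <- matrix(16:1, ncol = 4)
m  <- pmin(m1, m2)                         # cell-wise minimum of both matrices

# Surround the matrix with a one-cell border of NA
padded <- matrix(NA, nrow(m) + 2, ncol(m) + 2)
padded[2:(nrow(m) + 1), 2:(ncol(m) + 1)] <- m

# Minimum over each 3x3 window; the NA border is ignored via na.rm = TRUE
m3 <- matrix(NA, nrow(m), ncol(m))
for (i in seq_len(nrow(m))) {
  for (j in seq_len(ncol(m))) {
    m3[i, j] <- min(padded[i:(i + 2), j:(j + 2)], na.rm = TRUE)
  }
}
m3
```

Cell (i, j) of m sits at (i + 1, j + 1) in padded, so the window i:(i + 2), j:(j + 2) is centred on it. For large matrices a dedicated raster or image-processing package is likely to be faster than a double loop.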
Apologies if the question is too basic. What would be an effective approach/idea (in R) to convert
list(c(1), c(1,2), c(1,2,3), c(1,2,3,4))
to square matrix form
[,1] [,2] [,3] [,4]
[1,] 1 NA NA NA
[2,] 1 2 NA NA
[3,] 1 2 3 NA
[4,] 1 2 3 4
I suppose there is some quick dynamic way to append just the right number of NA values and then convert to a matrix.
Naturally, the size of the (square) matrix can change.
Thanks in advance for your time.
You can use
## create the list
x <- Map(":", 1, 1:4)
ml <- max(lengths(x))
do.call(rbind, lapply(x, "length<-", ml))
# [,1] [,2] [,3] [,4]
# [1,] 1 NA NA NA
# [2,] 1 2 NA NA
# [3,] 1 2 3 NA
# [4,] 1 2 3 4
Or you could do
library(data.table)
as.matrix(unname(rbindlist(lapply(x, as.data.frame.list), fill = TRUE)))
# [,1] [,2] [,3] [,4]
# [1,] 1 NA NA NA
# [2,] 1 2 NA NA
# [3,] 1 2 3 NA
# [4,] 1 2 3 4
And one more for good measure ... Fore!
m <- stringi::stri_list2matrix(x, byrow = TRUE)
mode(m) <- "numeric"
m
# [,1] [,2] [,3] [,4]
# [1,] 1 NA NA NA
# [2,] 1 2 NA NA
# [3,] 1 2 3 NA
# [4,] 1 2 3 4
In R, let M be the matrix
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 1 3 3
[3,] 2 4 5
[4,] 6 7 8
I would like to select the submatrix m
[,1] [,2] [,3]
[1,] 1 3 3
[2,] 2 4 5
[3,] 6 7 8
using unique on M[,1], specifying to keep the row with the maximal value in the second column of M.
At the end, the algorithm should keep row [2,] from the set {[1,], [2,]}. Unfortunately unique() returns a vector with the actual values, not the row numbers, after eliminating duplicates.
Is there a way to get the answer without the package plyr?
Thanks a lot,
Avitus
Here's how:
is.first.max <- function(x) seq_along(x) == which.max(x)
M[as.logical(ave(M[, 2], M[, 1], FUN = is.first.max)), ]
# [,1] [,2] [,3]
# [1,] 1 3 3
# [2,] 2 4 5
# [3,] 6 7 8
You're looking for duplicated.
m <- as.matrix(read.table(text="1 2 3
1 3 3
2 4 5
6 7 8"))
m <- m[order(m[,2], decreasing=TRUE), ]
m[!duplicated(m[,1]),]
# V1 V2 V3
# [1,] 6 7 8
# [2,] 2 4 5
# [3,] 1 3 3
Not the most efficient:
M <- matrix(c(1,1,2,6,2,3,4,7,3,3,5,8),4)
t(sapply(unique(M[,1]), function(i) {
  temp <- M[M[,1]==i,,drop=FALSE]
  temp[which.max(temp[,2]),]
}))
# [,1] [,2] [,3]
#[1,] 1 3 3
#[2,] 2 4 5
#[3,] 6 7 8
I have the following matrix
2 4 1
6 32 1
4 2 1
5 3 2
4 2 2
I want to make the following two matrices based on 3rd column
first
2 4
6 32
4 2
second
5 3
4 2
Best I can come up with, but I get an error
x <- cbind(mat[,1], mat[,2]) if mat[,3]=1
y <- cbind(mat[,1], mat[,2]) if mat[,3]=2
If mat is your matrix:
mat <- matrix(1:15,ncol=3)
mat[,3] <- c(1,1,1,2,2)
> mat
[,1] [,2] [,3]
[1,] 1 6 1
[2,] 2 7 1
[3,] 3 8 1
[4,] 4 9 2
[5,] 5 10 2
Then you can use split:
> lapply( split( mat[,1:2], mat[,3] ), matrix, ncol=2)
$`1`
[,1] [,2]
[1,] 1 6
[2,] 2 7
[3,] 3 8
$`2`
[,1] [,2]
[1,] 4 9
[2,] 5 10
The lapply of matrix is necessary because split drops the attributes that make a vector a matrix, so you need to add them back in.
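To see what that means in practice, here is a small sketch using the same mat; the names parts and rebuilt are only for illustration.

```r
mat <- matrix(1:15, ncol = 3)
mat[, 3] <- c(1, 1, 1, 2, 2)

# split() treats the matrix as a plain vector and recycles the grouping factor
parts <- split(mat[, 1:2], mat[, 3])
parts$`1`          # a plain vector: 1 2 3 6 7 8
dim(parts$`1`)     # NULL, the dim attribute is gone

# Restoring the shape: each piece holds the values of two columns
rebuilt <- lapply(parts, matrix, ncol = 2)
dim(rebuilt$`1`)   # 3 2
```

This works because the grouping vector has one entry per row and is recycled across the columns, so the values of each group stay in column order.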
Yet another example:
#test data
mat <- matrix(1:15,ncol=3)
mat[,3] <- c(1,1,1,2,2)
#make a list storing a matrix for each id as components
result <- lapply(by(mat,mat[,3],identity),as.matrix)
Final product:
> result
$`1`
V1 V2 V3
1 1 6 1
2 2 7 1
3 3 8 1
$`2`
V1 V2 V3
4 4 9 2
5 5 10 2
If you have a matrix A, this will get the first two columns when the third column is 1:
A[A[,3] == 1,c(1,2)]
You can use this to obtain matrices for any value in the third column.
Explanation: A[,3] == 1 returns a vector of booleans, where the i-th position is TRUE if A[i,3] is 1. This vector of booleans can be used to index into a matrix to extract the rows we want.
Disclaimer: I have very little experience with R; this is the MATLAB-ish way to do it.
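Applied to the matrix from the question, the logical-indexing idea could look like the sketch below. The drop = FALSE argument is an addition not in the answer above: without it, R returns a plain vector when only one row matches.

```r
# The matrix from the question
A <- matrix(c(2, 6, 4, 5, 4,
              4, 32, 2, 3, 2,
              1, 1, 1, 2, 2), ncol = 3)

keep   <- A[, 3] == 1                        # logical vector, one entry per row
first  <- A[keep, 1:2, drop = FALSE]         # rows where column 3 is 1
second <- A[A[, 3] == 2, 1:2, drop = FALSE]  # rows where column 3 is 2

# Without drop = FALSE a single matching row collapses to a vector
one_row_vec <- A[A[, 1] == 6, 1:2]
one_row_mat <- A[A[, 1] == 6, 1:2, drop = FALSE]
is.matrix(one_row_vec)   # FALSE
is.matrix(one_row_mat)   # TRUE
```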
split.data.frame can also be used to split a matrix.
mat <- matrix(1:15,ncol=3)
mat[,3] <- c(1,1,1,2,2)
x <- split.data.frame(mat[,-3], mat[,3])
x
#$`1`
# [,1] [,2]
#[1,] 1 6
#[2,] 2 7
#[3,] 3 8
#
#$`2`
# [,1] [,2]
#[1,] 4 9
#[2,] 5 10
str(x)
#List of 2
# $ 1: num [1:3, 1:2] 1 2 3 6 7 8
# $ 2: num [1:2, 1:2] 4 5 9 10
Or split the index and use it in lapply to subset.
lapply(split(seq_along(mat[,3]), mat[,3]), \(i) mat[i, -3, drop=FALSE])
#$`1`
# [,1] [,2]
#[1,] 1 6
#[2,] 2 7
#[3,] 3 8
#
#$`2`
# [,1] [,2]
#[1,] 4 9
#[2,] 5 10
This is a functional version of pedrosorio's idea:
getthird <- function(mat, idx) mat[mat[,3]==idx, 1:2]
sapply(unique(mat[,3]), getthird, mat=mat) #idx gets sent the unique values
#-----------
[[1]]
[,1] [,2]
[1,] 1 6
[2,] 2 7
[3,] 3 8
[[2]]
[,1] [,2]
[1,] 4 9
[2,] 5 10
We can use by or tapply
> by(seq_along(mat[, 3]), mat[, 3], function(k) mat[k, -3])
mat[, 3]: 1
[,1] [,2]
[1,] 1 6
[2,] 2 7
[3,] 3 8
------------------------------------------------------------
mat[, 3]: 2
[,1] [,2]
[1,] 4 9
[2,] 5 10
> tapply(seq_along(mat[, 3]), mat[, 3], function(k) mat[k, -3])
$`1`
[,1] [,2]
[1,] 1 6
[2,] 2 7
[3,] 3 8
$`2`
[,1] [,2]
[1,] 4 9
[2,] 5 10