From tree list to a matrix in R

Apologies if the question is too basic. What would be an effective approach (in R) to convert
list(c(1), c(1,2), c(1,2,3), c(1,2,3,4))
to square matrix form
[,1] [,2] [,3] [,4]
[1,] 1 NA NA NA
[2,] 1 2 NA NA
[3,] 1 2 3 NA
[4,] 1 2 3 4
I suppose there is some quick, dynamic way to append just the right number of NA values and then convert to a matrix.
Naturally, the size of the (square) matrix can change.
Thanks in advance for your time.

You can use
## create the list
x <- Map(":", 1, 1:4)
ml <- max(lengths(x))
do.call(rbind, lapply(x, "length<-", ml))
# [,1] [,2] [,3] [,4]
# [1,] 1 NA NA NA
# [2,] 1 2 NA NA
# [3,] 1 2 3 NA
# [4,] 1 2 3 4
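For reuse, the padding step can be wrapped in a small function. This is only a minimal sketch: the name list_to_matrix is hypothetical (it does not come from any package), and it assumes every list element is an atomic vector.

```r
# Hypothetical helper: pad each list element with NA to a common length,
# then stack the padded vectors as rows.
list_to_matrix <- function(x) {
  ml <- max(lengths(x))
  do.call(rbind, lapply(x, `length<-`, ml))
}

x <- list(c(1), c(1, 2), c(1, 2, 3), c(1, 2, 3, 4))
m <- list_to_matrix(x)
dim(m)
```

Because the target length is computed from the list itself, the same call works when the size of the matrix changes.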
Or you could do
library(data.table)
as.matrix(unname(rbindlist(lapply(x, as.data.frame.list), fill = TRUE)))
# [,1] [,2] [,3] [,4]
# [1,] 1 NA NA NA
# [2,] 1 2 NA NA
# [3,] 1 2 3 NA
# [4,] 1 2 3 4
And one more for good measure ... Fore!
m <- stringi::stri_list2matrix(x, byrow = TRUE)
mode(m) <- "numeric"
m
# [,1] [,2] [,3] [,4]
# [1,] 1 NA NA NA
# [2,] 1 2 NA NA
# [3,] 1 2 3 NA
# [4,] 1 2 3 4

Related

How to merge two excel spreadsheets in r? [duplicate]

cbind(1:2, 1:10)
[,1] [,2]
[1,] 1 1
[2,] 2 2
[3,] 1 3
[4,] 2 4
[5,] 1 5
[6,] 2 6
[7,] 1 7
[8,] 2 8
[9,] 1 9
[10,] 2 10
I want an output like below
[,1] [,2]
[1,] 1 1
[2,] 2 2
[3,] 3
[4,] 4
[5,] 5
[6,] 6
[7,] 7
[8,] 8
[9,] 9
[10,] 10
The trick is to make all your inputs the same length.
x <- 1:2
y <- 1:10
n <- max(length(x), length(y))
length(x) <- n
length(y) <- n
If you want your output to be an array, then cbind works, but you get additional NA values to pad out the rectangle.
cbind(x, y)
x y
[1,] 1 1
[2,] 2 2
[3,] NA 3
[4,] NA 4
[5,] NA 5
[6,] NA 6
[7,] NA 7
[8,] NA 8
[9,] NA 9
[10,] NA 10
To get rid of the NAs, the output must be a list.
Map(function(...)
{
ans <- c(...)
ans[!is.na(ans)]
}, as.list(x), as.list(y)
)
[[1]]
[1] 1 1
[[2]]
[1] 2 2
[[3]]
[1] 3
[[4]]
[1] 4
[[5]]
[1] 5
[[6]]
[1] 6
[[7]]
[1] 7
[[8]]
[1] 8
[[9]]
[1] 9
[[10]]
[1] 10
EDIT: I swapped mapply(..., SIMPLIFY = FALSE) for Map.
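The same length-padding idea extends to more than two vectors. The sketch below is an assumption-laden illustration, not part of the answer above: cbind_fill is a hypothetical name, and the inputs are assumed to be plain atomic vectors.

```r
# Hypothetical helper: pad any number of vectors with NA to the longest
# length, then bind them as columns.
cbind_fill <- function(...) {
  vecs <- list(...)
  n <- max(lengths(vecs))
  do.call(cbind, lapply(vecs, `length<-`, n))
}

res <- cbind_fill(1:2, 1:10, 1:5)
dim(res)
```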
I came across a similar problem and would like to suggest an additional solution that some, I hope, may find useful. The solution is fairly straightforward and makes use of the cbind.na function provided by the qpcR package.
Example
x <- 1:2
y <- 1:10
dta <- qpcR:::cbind.na(x, y)
Results
> head(dta)
x y
[1,] 1 1
[2,] 2 2
[3,] NA 3
[4,] NA 4
[5,] NA 5
[6,] NA 6
Side comments
Following the OP's original example, column names can be easily removed:
colnames(dta) <- NULL
This produces the desired layout in full:
> head(dta)
[,1] [,2]
[1,] 1 1
[2,] 2 2
[3,] NA 3
[4,] NA 4
[5,] NA 5
[6,] NA 6
I would like to propose an alternate solution that makes use of the rowr package and their cbind.fill function.
> rowr::cbind.fill(1:2,1:10, fill = NA);
object object
1 1 1
2 2 2
3 NA 3
4 NA 4
5 NA 5
6 NA 6
7 NA 7
8 NA 8
9 NA 9
10 NA 10
Or alternatively, to match the OP's desired output:
> rowr::cbind.fill(1:2,1:10, fill = '');
object object
1 1 1
2 2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10
Helper function...
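bind.pad <- function(l, side="r", len=max(lengths(l)))
{
  if (side %in% c("b", "r")) {
    # pad at the end: each element becomes one column of length len
    out <- sapply(l, 'length<-', value=len)
  } else {
    # pad at the start: reverse, extend with NA, reverse back
    # (lapply keeps a list even when all elements have equal length)
    out <- sapply(lapply(lapply(l, rev), 'length<-', value=len), rev)
  }
  # "r" and "l" put each element in a row instead of a column
  if (side %in% c("r", "l")) out <- t(out)
  out
}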
Examples:
> l <- lapply(c(3,2,1,2,3),seq)
> lapply(c("t","l","b","r"), bind.pad, l=l, len=4)
[[1]]
[,1] [,2] [,3] [,4] [,5]
[1,] NA NA NA NA NA
[2,] 1 NA NA NA 1
[3,] 2 1 NA 1 2
[4,] 3 2 1 2 3
[[2]]
[,1] [,2] [,3] [,4]
[1,] NA 1 2 3
[2,] NA NA 1 2
[3,] NA NA NA 1
[4,] NA NA 1 2
[5,] NA 1 2 3
[[3]]
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 1 1
[2,] 2 2 NA 2 2
[3,] 3 NA NA NA 3
[4,] NA NA NA NA NA
[[4]]
[,1] [,2] [,3] [,4]
[1,] 1 2 3 NA
[2,] 1 2 NA NA
[3,] 1 NA NA NA
[4,] 1 2 NA NA
[5,] 1 2 3 NA
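Given that some of the solutions above rely on packages that are no longer available, here is a helper function that uses only tidyverse packages (dplyr, purrr and tibble, plus rlang for sym).
library(dplyr)   # bind_cols(), %>%
library(purrr)   # map(), map_int()
library(tibble)  # add_row(), tibble()
bind_cols_fill <- function(df_list) {
  max_rows <- map_int(df_list, nrow) %>% max()
  map(df_list, function(df) {
    if(nrow(df) == max_rows) return(df)
    # pad the first column with NA; add_row fills the other columns with NA
    first <- names(df)[1] %>% rlang::sym()
    df %>% add_row(!!first := rep(NA, max_rows - nrow(df)))
  }) %>% bind_cols()
}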
Note that this takes a list of data frames, so it is slightly cumbersome if one only wants to combine two vectors:
x <- 1:2
y <- 1:10
bind_cols_fill(list(tibble(x), tibble(y)))
Another solution with no dependencies:
my_bind <- function(x, y){
if(length(x = x) > length(x = y)){
len_diff <- length(x) - length(y)
y <- c(y, rep(NA, len_diff))
}else if(length(x = x) < length(x = y)){
len_diff <- length(y) - length(x)
x <- c(x, rep(NA, len_diff))
}
cbind(x, y)
}
my_bind(x = letters[1:4], y = letters[1:2])

Access (and fill) different rows in array

Let's say I have a 3-dimensional array:
a <- array(dim = c(3, 2, 3))
and a vector with indices for the 1st dimension:
ind <- c(1,2,3)
Now I want to put a number (e.g. 1) into the rows (ind) across the 2nd and 3rd dimensions.
a[ind,,] <- matrix(1, ncol = 2, nrow = length(ind))
clearly does not work! A loop would work, however is there a better solution?
The result should be:
, , 1
[,1] [,2]
[1,] 1 1
[2,] NA NA
[3,] NA NA
, , 2
[,1] [,2]
[1,] NA NA
[2,] 1 1
[3,] NA NA
, , 3
[,1] [,2]
[1,] NA NA
[2,] NA NA
[3,] 1 1
Using mapply.
mapply(function(x, y) {a[x,,y] <<- 1;a}, ind, ind)
a
# , , 1
#
# [,1] [,2]
# [1,] 1 1
# [2,] NA NA
# [3,] NA NA
#
# , , 2
#
# [,1] [,2]
# [1,] NA NA
# [2,] 1 1
# [3,] NA NA
#
# , , 3
#
# [,1] [,2]
# [1,] NA NA
# [2,] NA NA
# [3,] 1 1
Or, for this specific case where x and y are the same, just:
sapply(ind, function(x) {a[x,,x] <<- 1;a})
Note: If you find the console output unnecessary, wrap invisible() around the code.
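For comparison, the loop mentioned in the question can be sketched as follows. It avoids the global assignment (<<-) used above; the variable names follow the question.

```r
a <- array(dim = c(3, 2, 3))
ind <- c(1, 2, 3)

# For each index i, fill row i of slice i (all columns) with 1.
for (i in ind) {
  a[i, , i] <- 1
}
```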
Another alternative that replaces everything in one step. It takes advantage of how [ behaves when its index is a matrix: each column of the index matrix addresses one dimension of the array, and each row selects a single element.
To confirm that it is putting values where we intend, I'll modify the replacement matrix to be 1:6.
# original data
a <- array(dim = c(3, 2, 3))
# the replacement matrix
r <- matrix(1:6, ncol = 2, nrow = length(ind))
# this will be what we index `a` on
m <- expand.grid(ind, seq_len(dim(a)[2]))
m$Var3 <- m$Var1 # repeat `ind` in the third dimension
m
# Var1 Var2 Var3
# 1 1 1 1
# 2 2 1 2
# 3 3 1 3
# 4 1 2 1
# 5 2 2 2
# 6 3 2 3
And the replacement:
a[as.matrix(m)] <- r
a
# , , 1
# [,1] [,2]
# [1,] 1 4
# [2,] NA NA
# [3,] NA NA
# , , 2
# [,1] [,2]
# [1,] NA NA
# [2,] 2 5
# [3,] NA NA
# , , 3
# [,1] [,2]
# [1,] NA NA
# [2,] NA NA
# [3,] 3 6
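If the goal is simply to write the constant 1, as in the question, the index matrix can also be built directly with cbind and rep. This is a sketch under the same assumptions as above (the i-th entry of ind addresses both the row and the slice); cols and idx are illustrative names.

```r
a <- array(dim = c(3, 2, 3))
ind <- c(1, 2, 3)
cols <- seq_len(dim(a)[2])

# One row per target cell: (row, column, slice).
idx <- cbind(rep(ind, times = length(cols)),
             rep(cols, each = length(ind)),
             rep(ind, times = length(cols)))
a[idx] <- 1
```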

subset a matrix, and get NA if index is not valid

I am trying to subset a matrix to always get a 3*3 matrix.
For example, the matrix being subset is a<-matrix(1:15,3,5), usually when I subset it using a[0:2,0:2], I get:
[,1] [,2]
[1,] 1 4
[2,] 2 5
But I want to get something like:
[,1] [,2] [,3]
[1,] NA NA NA
[2,] NA 1 4
[3,] NA 2 5
Force all your 0's to NAs when you select, as well as any 'out-of-bounds' values:
ro <- 0:2
co <- 0:2
a[replace(ro,ro == 0 | ro > nrow(a),NA),
replace(co,co == 0 | co > ncol(a),NA)]
# [,1] [,2] [,3]
#[1,] NA NA NA
#[2,] NA 1 4
#[3,] NA 2 5
This will even work with combinations of the parts you want missing:
ro <- c(1,0,2)
co <- 0:2
a[replace(ro,ro == 0 | ro > nrow(a),NA),
replace(co,co == 0 | co > ncol(a),NA)]
# [,1] [,2] [,3]
#[1,] NA 1 4
#[2,] NA NA NA
#[3,] NA 2 5
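The replace() calls can be folded into a small wrapper. The sketch below uses a hypothetical name, subset_na, and makes one extra assumption not stated above: any index below 1 (not just 0) is treated as invalid.

```r
# Hypothetical helper: subset `m`, returning NA for any row or column
# index that is below 1 or beyond the matrix bounds.
subset_na <- function(m, rows, cols) {
  rows <- replace(rows, rows < 1 | rows > nrow(m), NA)
  cols <- replace(cols, cols < 1 | cols > ncol(m), NA)
  m[rows, cols, drop = FALSE]
}

a <- matrix(1:15, 3, 5)
res <- subset_na(a, 0:2, 0:2)
edge <- subset_na(a, 2:4, 4:6)  # window hanging off the bottom-right corner
```

The drop = FALSE argument keeps the result a matrix even when only one row or column is requested.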
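You could create your own padding function that fills the space up to 3x3 with NA values (added at the top and on the left):
padmatrix <- function(a, dim=c(3,3)) {
  stopifnot(all(dim(a)<=dim))
  # prepend as many NA rows, then NA columns, as are missing
  a <- rbind(matrix(NA, dim[1]-nrow(a), ncol(a)), a)
  cbind(matrix(NA, nrow(a), dim[2]-ncol(a)), a)
}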
padmatrix(a[1:2, 1:2])
# [,1] [,2] [,3]
# [1,] NA NA NA
# [2,] NA 1 4
# [3,] NA 2 5

create new matrix with new dimension and omitting NA values

I have a matrix with some NA values
for example:
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 2 NA 8 11
[3,] 3 6 NA 12
I want to create a new matrix from the data in my matrix above, with new dimensions and no NA values. (It is OK to have NA only in the last few elements.)
something like:
[,1] [,2] [,3]
[1,] 1 6 11
[2,] 2 7 12
[3,] 3 8 NA
[4,] 4 10 NA
I would appreciate if anyone can help me.
Thanks
Something like this as well:
m <- matrix(1:12, nc=4)
m[c(5, 9)] <- NA
matrix(c(na.omit(c(m)), rep(NA, sum(is.na(m)))), nrow=4)
m <- matrix(1:12, nc=4)
m[c(5, 9)] <- NA
# create an array of the appropriate class and dimension (filled with NA values)
dims <- c(4, 3)
md <- array(m[0], dim=dims)
# replace first "n" values with non-NA values from m
nonNAm <- na.omit(c(m))
md[seq_along(nonNAm)] <- nonNAm
md
# [,1] [,2] [,3]
# [1,] 1 6 11
# [2,] 2 7 12
# [3,] 3 8 NA
# [4,] 4 10 NA
Yet another attempt. This will keep the order of the values in column order as a matrix usually would. E.g.:
mat <- matrix(c(1,2,3,4,NA,6,7,8,NA,10,11,12),nrow=3)
array(mat[order(is.na(mat))],dim=dim(mat))
# [,1] [,2] [,3] [,4]
#[1,] 1 4 8 12
#[2,] 2 6 10 NA
#[3,] 3 7 11 NA
Now change a value to check it doesn't affect the ordering.
mat[7] <- 20
array(mat[order(is.na(mat))],dim=dim(mat))
# [,1] [,2] [,3] [,4]
#[1,] 1 4 8 12
#[2,] 2 6 10 NA
#[3,] 3 20 11 NA
You can then specify whatever dimensions you feel like to the dim= argument:
array(mat[order(is.na(mat))],dim=c(4,3))
# [,1] [,2] [,3]
#[1,] 1 6 11
#[2,] 2 20 12
#[3,] 3 8 NA
#[4,] 4 10 NA
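The reason the order is preserved: is.na() yields only FALSE and TRUE, and order() leaves ties in their original positions, so the non-NA values keep their sequence and the NAs move to the end. A small sketch on a plain vector (v is an illustrative name):

```r
v <- c(5, NA, 3, NA, 9)

# FALSE sorts before TRUE; ties keep their original order.
sorted <- v[order(is.na(v))]
sorted
```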
This is fairly straightforward if you want to preserve order column-wise or row-wise.
originalMatrix <- matrix(c(1,2,3,4,NA,6,7,8,NA,10,11,12),nrow=3)
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 2 NA 8 11
[3,] 3 6 NA 12
newMatrixNums <- originalMatrix[!is.na(originalMatrix)]
[1] 1 2 3 4 6 7 8 10 11 12
Pad with NA:
newMatrixNums2 <- c(newMatrixNums,rep(NA,2))
Column-wise:
matrix(newMatrixNums2,nrow=3)
[,1] [,2] [,3] [,4]
[1,] 1 4 8 12
[2,] 2 6 10 NA
[3,] 3 7 11 NA
Row-wise:
matrix(newMatrixNums2,nrow=3,byrow=T)
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 6 7 8 10
[3,] 11 12 NA NA
Here's one way:
# Reproducing your data
m <- matrix(1:12, nc=4)
m[c(5, 9)] <- NA
# Your desired dimensions
dims <- c(4, 3)
array(c(na.omit(c(m)), rep(NA, prod(dims) - length(na.omit(c(m))))), dim=dims)
# [,1] [,2] [,3]
# [1,] 1 6 11
# [2,] 2 7 12
# [3,] 3 8 NA
# [4,] 4 10 NA
This can do the job, though I don't know whether it is a good way.
# non-NA values, in column order
list2 <- m[!is.na(m)]
# as many NA values as were dropped
element2 <- rep(NA, length(m) - length(list2))
newm <- matrix(c(list2, element2), nrow=4)
If you increase the length of a numeric vector with length(x)<- without assigning values to the new elements, the new values are given NA as their value. So length(M2) <- length(M) takes the shorter M2 vector and makes it the same length as M by adding NA values to the new elements.
## original
> (M <- matrix(c(1:4,NA,6:8,NA,10:12), nrow = 3))
# [,1] [,2] [,3] [,4]
# [1,] 1 4 7 10
# [2,] 2 NA 8 11
# [3,] 3 6 NA 12
## new
> M2 <- M[!is.na(M)]; length(M2) <- length(M)
> matrix(M2, ncol(M))
# [,1] [,2] [,3]
# [1,] 1 6 11
# [2,] 2 7 12
# [3,] 3 8 NA
# [4,] 4 10 NA
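The length<- approach can be generalized to arbitrary target dimensions. This is a minimal sketch; drop_na_reshape is a hypothetical name, and it assumes the target dimensions are large enough to hold all non-NA values.

```r
# Hypothetical helper: keep non-NA values in column order and refill
# a matrix of the requested dimensions, padding the tail with NA.
drop_na_reshape <- function(m, dims) {
  vals <- m[!is.na(m)]
  stopifnot(length(vals) <= prod(dims))
  length(vals) <- prod(dims)  # new trailing elements become NA
  matrix(vals, nrow = dims[1], ncol = dims[2])
}

M <- matrix(c(1:4, NA, 6:8, NA, 10:12), nrow = 3)
res <- drop_na_reshape(M, c(4, 3))
```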

In R how to control the cycling direction of values when copying in an array?

I would like to copy some values across rows, but the default seems to be cycling over columns. Is there an elegant way to achieve what I want?
Here is the code I have:
> w = array(NA,dim=c(4,2))
> w
[,1] [,2]
[1,] NA NA
[2,] NA NA
[3,] NA NA
[4,] NA NA
> w[2:4,] = c(2,3)
> w
[,1] [,2]
[1,] NA NA
[2,] 2 3
[3,] 3 2
[4,] 2 3
But I would like the values to be cycled over rows to obtain:
[,1] [,2]
[1,] NA NA
[2,] 2 3
[3,] 2 3
[4,] 2 3
You can write a new transposed assignment function:
`t<-` <- function(x, value)
{
t(matrix(value, nrow=ncol(x), ncol=nrow(x)))
}
Result:
> w = array(NA,dim=c(4,2))
> t(w[2:4,]) = c(2,3)
> w
[,1] [,2]
[1,] NA NA
[2,] 2 3
[3,] 2 3
[4,] 2 3
Create the matrix row-wise, then assign it:
w[2:4, ] <- matrix(c(2,3), nrow=3, ncol=2, byrow=TRUE)
Another elegant way :-)
w[2:4,] <- rep(c(2,3),each=length(2:4))
[,1] [,2]
[1,] NA NA
[2,] 2 3
[3,] 2 3
[4,] 2 3
w[2:4, ] <- matrix(c(2,3),nrow=1)[ rep(1,3), ]
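All three assignments rely on the same fact: R fills a matrix column by column, so the replacement values have to be arranged so that each column receives a single repeated value. The sketch below checks that the three variants agree; rows, vals and the w1 to w3 copies are illustrative names.

```r
w <- array(NA, dim = c(4, 2))
rows <- 2:4
vals <- c(2, 3)

# Variant 1: build the replacement row-wise.
w1 <- w
w1[rows, ] <- matrix(vals, nrow = length(rows), ncol = length(vals), byrow = TRUE)

# Variant 2: repeat each value once per target row (column-major order).
w2 <- w
w2[rows, ] <- rep(vals, each = length(rows))

# Variant 3: replicate a one-row matrix by indexing its single row.
w3 <- w
w3[rows, ] <- matrix(vals, nrow = 1)[rep(1, length(rows)), ]

same <- identical(w1, w2) && identical(w2, w3)
```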
