Creating a matrix from multiple column vectors - r

How can I create a matrix from multiple column vectors?
I know that I can easily create a data frame with column vectors:
> colA <- 1:5
> colB <- 21:25
> colC <- 31:35
> data.frame(colA, colB, colC)
colA colB colC
1 1 21 31
2 2 22 32
3 3 23 33
4 4 24 34
5 5 25 35
However, when I try matrix(), it gives me unexpected results, as shown below. How can I create my desired matrix? I know I can do as.matrix(df), which nicely preserves the column names, but I'm looking for a more direct approach.
> matrix(colA, colB, colC)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[1,] 1 2 3 4 5 1 2 3 4 5 1 2 3
[2,] 2 3 4 5 1 2 3 4 5 1 2 3 4
[3,] 3 4 5 1 2 3 4 5 1 2 3 4 5
[4,] 4 5 1 2 3 4 5 1 2 3 4 5 1
[5,] 5 1 2 3 4 5 1 2 3 4 5 1 2
[6,] 1 2 3 4 5 1 2 3 4 5 1 2 3
[7,] 2 3 4 5 1 2 3 4 5 1 2 3 4
[8,] 3 4 5 1 2 3 4 5 1 2 3 4 5
[9,] 4 5 1 2 3 4 5 1 2 3 4 5 1
[10,] 5 1 2 3 4 5 1 2 3 4 5 1 2
[11,] 1 2 3 4 5 1 2 3 4 5 1 2 3
[12,] 2 3 4 5 1 2 3 4 5 1 2 3 4
[13,] 3 4 5 1 2 3 4 5 1 2 3 4 5
[14,] 4 5 1 2 3 4 5 1 2 3 4 5 1
[15,] 5 1 2 3 4 5 1 2 3 4 5 1 2
[16,] 1 2 3 4 5 1 2 3 4 5 1 2 3
[17,] 2 3 4 5 1 2 3 4 5 1 2 3 4
[18,] 3 4 5 1 2 3 4 5 1 2 3 4 5
[19,] 4 5 1 2 3 4 5 1 2 3 4 5 1
[20,] 5 1 2 3 4 5 1 2 3 4 5 1 2
[21,] 1 2 3 4 5 1 2 3 4 5 1 2 3
[,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25]
[1,] 4 5 1 2 3 4 5 1 2 3 4 5
[2,] 5 1 2 3 4 5 1 2 3 4 5 1
[3,] 1 2 3 4 5 1 2 3 4 5 1 2
[4,] 2 3 4 5 1 2 3 4 5 1 2 3
[5,] 3 4 5 1 2 3 4 5 1 2 3 4
[6,] 4 5 1 2 3 4 5 1 2 3 4 5
[7,] 5 1 2 3 4 5 1 2 3 4 5 1
[8,] 1 2 3 4 5 1 2 3 4 5 1 2
[9,] 2 3 4 5 1 2 3 4 5 1 2 3
[10,] 3 4 5 1 2 3 4 5 1 2 3 4
[11,] 4 5 1 2 3 4 5 1 2 3 4 5
[12,] 5 1 2 3 4 5 1 2 3 4 5 1
[13,] 1 2 3 4 5 1 2 3 4 5 1 2
[14,] 2 3 4 5 1 2 3 4 5 1 2 3
[15,] 3 4 5 1 2 3 4 5 1 2 3 4
[16,] 4 5 1 2 3 4 5 1 2 3 4 5
[17,] 5 1 2 3 4 5 1 2 3 4 5 1
[18,] 1 2 3 4 5 1 2 3 4 5 1 2
[19,] 2 3 4 5 1 2 3 4 5 1 2 3
[20,] 3 4 5 1 2 3 4 5 1 2 3 4
[21,] 4 5 1 2 3 4 5 1 2 3 4 5
[,26] [,27] [,28] [,29] [,30] [,31]
[1,] 1 2 3 4 5 1
[2,] 2 3 4 5 1 2
[3,] 3 4 5 1 2 3
[4,] 4 5 1 2 3 4
[5,] 5 1 2 3 4 5
[6,] 1 2 3 4 5 1
[7,] 2 3 4 5 1 2
[8,] 3 4 5 1 2 3
[9,] 4 5 1 2 3 4
[10,] 5 1 2 3 4 5
[11,] 1 2 3 4 5 1
[12,] 2 3 4 5 1 2
[13,] 3 4 5 1 2 3
[14,] 4 5 1 2 3 4
[15,] 5 1 2 3 4 5
[16,] 1 2 3 4 5 1
[17,] 2 3 4 5 1 2
[18,] 3 4 5 1 2 3
[19,] 4 5 1 2 3 4
[20,] 5 1 2 3 4 5
[21,] 1 2 3 4 5 1
Warning message:
In matrix(colA, colB, colC) :
data length [5] is not a sub-multiple or multiple of the number of rows [21]

You can use cbind to produce the desired matrix:
mat <- cbind(colA, colB, colC)
mat
# colA colB colC
# [1,] 1 21 31
# [2,] 2 22 32
# [3,] 3 23 33
# [4,] 4 24 34
# [5,] 5 25 35
class(mat)
# [1] "matrix" "array"   (R >= 4.0.0; older versions print only "matrix")
You don't get the matrix you're expecting from matrix(colA, colB, colC) because the arguments are matched by position to the first three parameters of matrix: data, nrow, and ncol. That is why the output above has 21 rows and 31 columns (the first elements of colB and colC), filled by recycling colA. If you want to use the matrix function, you need to provide your data as a single argument, with something like mat <- matrix(c(colA, colB, colC), ncol=3). With this syntax you do not get the column names from the variables as you do with cbind.
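A short sketch of the two routes described above, using the same colA, colB, and colC. Passing dimnames to matrix() is an addition here, shown as one way to recover the column names that cbind supplies automatically.

```r
colA <- 1:5
colB <- 21:25
colC <- 31:35

# cbind names the columns after the variables
mat_cbind <- cbind(colA, colB, colC)

# matrix() takes all the data in its first argument;
# dimnames (an optional extra) restores the column names
mat_matrix <- matrix(c(colA, colB, colC), ncol = 3,
                     dimnames = list(NULL, c("colA", "colB", "colC")))

# compare the two results
isTRUE(all.equal(mat_cbind, mat_matrix))
```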

Related

manipulation of list of matrices in R

I have a list of matrices, generated with the code below
a<-c(0,5,0,1,5,1,5,4,6,7)
b<-c(3,1,0,2,4,2,5,5,7,8)
c<-c(5,9,0,1,3,2,5,6,2,7)
d<-c(6,5,0,1,3,4,5,6,7,1)
k<-data.frame(a,b,c,d)
k<-as.matrix(k)
#dimnames(k)<-list(cntry,cntry)
e<-c(0,5,2,2,1,2,3,6,9,2)
f<-c(2,0,4,1,1,3,4,5,1,4)
g<-c(3,3,0,2,0,9,3,2,1,9)
h<-c(6,1,1,1,5,7,8,8,0,2)
l<-data.frame(e,f,g,h)
l<-as.matrix(l)
#dimnames(l)<-list(cntry,cntry)
list<-list(k,l)
names(list)<-2010:2011
list
$`2010`
a b c d
[1,] 0 3 5 6
[2,] 5 1 9 5
[3,] 0 3 2 2
[4,] 1 2 1 1
[5,] 5 4 3 3
[6,] 1 2 2 4
[7,] 5 5 5 5
[8,] 4 5 6 6
[9,] 6 7 2 7
[10,] 7 8 7 1
$`2011`
e f g h
[1,] 0 2 3 6
[2,] 5 0 3 1
[3,] 2 4 0 1
[4,] 2 1 2 1
[5,] 1 1 0 5
[6,] 2 3 9 7
[7,] 3 4 3 8
[8,] 6 5 2 8
[9,] 9 1 1 0
[10,] 2 4 9 2
In each matrix I would like to delete the rows that contain a value smaller than 1. But when I delete the first row in matrix "2010" (because it contains a value <1), the first row of every other matrix (here 2011) should be deleted too. Likewise, the third row of the first column is <1, so the third row should be deleted from all matrices, and so on...
The result should look like:
$`2010`
a b c d
[4,] 1 2 1 1
[6,] 1 2 2 4
[7,] 5 5 5 5
[8,] 4 5 6 6
[10,] 7 8 7 1
$`2011`
e f g h
[4,] 2 1 2 1
[6,] 2 3 9 7
[7,] 3 4 3 8
[8,] 6 5 2 8
[10,] 2 4 9 2
We can use rowSums
lapply(list, function(x) x[!rowSums(x <1),])
If we need to remove the same rows from every matrix, combine the row filters first:
ind <- Reduce(`&`, lapply(list, function(x) !rowSums(x < 1)))
lapply(list, function(x) x[ind,])
#$`2010`
# a b c d
#[1,] 1 2 1 1
#[2,] 1 2 2 4
#[3,] 5 5 5 5
#[4,] 4 5 6 6
#[5,] 7 8 7 1
#$`2011`
# e f g h
#[1,] 2 1 2 1
#[2,] 2 3 9 7
#[3,] 3 4 3 8
#[4,] 6 5 2 8
#[5,] 2 4 9 2
Update
Based on the OP's comments about removing rows that contain any value greater than the standard deviation of a column,
lapply(list, function(x) {
for(i in seq_len(ncol(x))) x <- x[!rowSums(x > sd(x[,i])),]
x
})
# get the union of the row indices that have at least one element less than 1
removed <- Reduce(union, lapply(list, function(x) which(rowSums(x < 1) != 0)))
lapply(list, function(x) x[-removed, ])
$`2010`
a b c d
[1,] 1 2 1 1
[2,] 1 2 2 4
[3,] 5 5 5 5
[4,] 4 5 6 6
[5,] 7 8 7 1
$`2011`
e f g h
[1,] 2 1 2 1
[2,] 2 3 9 7
[3,] 3 4 3 8
[4,] 6 5 2 8
[5,] 2 4 9 2

How do I generate all the vectors by swapping two positions?

Suppose I have a column vector of [1 1 1 2 2 2 3 3 3] and I want to generate all the different column vectors only by switching two positions. For an example, one such vector would be
[1 1 3 2 2 2 1 3 3].
Try this (it gives you a matrix, each row of which is a unique vector with 2 elements swapped from the original vector; there are 28 such unique vectors, including the original one):
v <- c(1,1,1,2,2,2,3,3,3)
unique(t(apply(t(combn(1:length(v), 2)), 1, function(x) {v[x] <- v[rev(x)]; v})))
with output:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 1 1 2 2 2 3 3 3 # original one
[2,] 2 1 1 1 2 2 3 3 3 # swap 1st & 4th elements
[3,] 2 1 1 2 1 2 3 3 3 # swap 1st & 5th
[4,] 2 1 1 2 2 1 3 3 3 # ...
[5,] 3 1 1 2 2 2 1 3 3
[6,] 3 1 1 2 2 2 3 1 3
[7,] 3 1 1 2 2 2 3 3 1
[8,] 1 2 1 1 2 2 3 3 3
[9,] 1 2 1 2 1 2 3 3 3
[10,] 1 2 1 2 2 1 3 3 3
[11,] 1 3 1 2 2 2 1 3 3
[12,] 1 3 1 2 2 2 3 1 3
[13,] 1 3 1 2 2 2 3 3 1
[14,] 1 1 2 1 2 2 3 3 3
[15,] 1 1 2 2 1 2 3 3 3
[16,] 1 1 2 2 2 1 3 3 3
[17,] 1 1 3 2 2 2 1 3 3
[18,] 1 1 3 2 2 2 3 1 3
[19,] 1 1 3 2 2 2 3 3 1
[20,] 1 1 1 3 2 2 2 3 3
[21,] 1 1 1 3 2 2 3 2 3
[22,] 1 1 1 3 2 2 3 3 2
[23,] 1 1 1 2 3 2 2 3 3
[24,] 1 1 1 2 3 2 3 2 3
[25,] 1 1 1 2 3 2 3 3 2
[26,] 1 1 1 2 2 3 2 3 3
[27,] 1 1 1 2 2 3 3 2 3
[28,] 1 1 1 2 2 3 3 3 2 # swap 6th & 9th
First, we can use the utils function combn to generate all of the possible pairs of positions to swap (startVec is the original vector). Here, I am assuming you don't want to swap two equal values (e.g. 1 and 1), so I check each pair to make sure the values differ:
allCombo <-
combn(1:length(startVec), 2)
toKeep <- apply(allCombo, 2, function(x) {
startVec[x[1]] != startVec[x[2]]
})
Then, apply along those that you are keeping, and swap the positions.
outVecs <- apply(allCombo[ , toKeep], 2, function(x){
temp <- startVec
temp[x] <- startVec[rev(x)]
return(temp)
})
This returns a matrix with one swapped vector per column, but you can convert it to a list, which may be easier to manage, like so:
outVecsInList <-
as.list(as.data.frame(outVecs))
head(outVecsInList) shows:
$V1
[1] 2 1 1 1 2 2 3 3 3
$V2
[1] 2 1 1 2 1 2 3 3 3
$V3
[1] 2 1 1 2 2 1 3 3 3
$V4
[1] 3 1 1 2 2 2 1 3 3
$V5
[1] 3 1 1 2 2 2 3 1 3
$V6
[1] 3 1 1 2 2 2 3 3 1
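The snippets in this answer never define startVec. A self-contained sketch of the same steps, assuming startVec is the vector from the question; the vectorised toKeep line is a small variation on the apply call above:

```r
startVec <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)

# every pair of positions, one pair per column
allCombo <- combn(seq_along(startVec), 2)

# keep only pairs whose two values differ
toKeep <- startVec[allCombo[1, ]] != startVec[allCombo[2, ]]

# swap each kept pair; the result has one swapped vector per column
outVecs <- apply(allCombo[, toKeep, drop = FALSE], 2, function(idx) {
  temp <- startVec
  temp[idx] <- startVec[rev(idx)]
  temp
})

ncol(outVecs)  # 27 swapped vectors (the original is not included)
```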

all combinations of n numbers

I would like to generate all unique combinations of n numbers without replacement. For example if n = 4, there are gamma(4+1) combinations shown here:
combos4 <- read.table(text='
x1 x2 x3 x4
1 2 3 4
1 2 4 3
1 3 2 4
1 3 4 2
1 4 2 3
1 4 3 2
2 1 3 4
2 1 4 3
2 3 1 4
2 3 4 1
2 4 1 3
2 4 3 1
3 1 2 4
3 1 4 2
3 2 1 4
3 2 4 1
3 4 1 2
3 4 2 1
4 1 2 3
4 1 3 2
4 2 1 3
4 2 3 1
4 3 1 2
4 3 2 1
', header = TRUE)
I can obtain these 24 combinations using the following code:
combos <- expand.grid(x1=1:4, x2=1:4, x3=1:4, x4=1:4)
my.combos <- combos[apply(combos,1,function(x) length(unique(x)) == ncol(combos)),]
my.combos
However, I always assumed there was an easier way in base R. Specifically, I always assumed the function combn() could return these combinations. However, if combn() can do this, I now realize I do not know how.
The expand.grid approach seems a little inefficient, especially when n is large.
In searching the internet and StackOverflow I have found several similar questions that use the algorithm tag and the code posted in answers, when provided, is not simple. Sorry if this is a duplicate or if I am missing something obvious.
The combinat package has a permn function
library(combinat)
permn(1:4)
[[1]]
[1] 1 2 3 4
[[2]]
[1] 1 2 4 3
[[3]]
[1] 1 4 2 3
[[4]]
[1] 4 1 2 3
[[5]]
[1] 4 1 3 2
[[6]]
[1] 1 4 3 2
[[7]]
[1] 1 3 4 2
[[8]]
[1] 1 3 2 4
[[9]]
[1] 3 1 2 4
[[10]]
[1] 3 1 4 2
[[11]]
[1] 3 4 1 2
[[12]]
[1] 4 3 1 2
[[13]]
[1] 4 3 2 1
[[14]]
[1] 3 4 2 1
[[15]]
[1] 3 2 4 1
[[16]]
[1] 3 2 1 4
[[17]]
[1] 2 3 1 4
[[18]]
[1] 2 3 4 1
[[19]]
[1] 2 4 3 1
[[20]]
[1] 4 2 3 1
[[21]]
[1] 4 2 1 3
[[22]]
[1] 2 4 1 3
[[23]]
[1] 2 1 4 3
[[24]]
[1] 2 1 3 4
gtools has a permutations function that might help:
library(gtools)
permutations(4,4,1:4)
## [,1] [,2] [,3] [,4]
## [1,] 1 2 3 4
## [2,] 1 2 4 3
## [3,] 1 3 2 4
## [4,] 1 3 4 2
## [5,] 1 4 2 3
## [6,] 1 4 3 2
## [7,] 2 1 3 4
## [8,] 2 1 4 3
## [9,] 2 3 1 4
## [10,] 2 3 4 1
## [11,] 2 4 1 3
## [12,] 2 4 3 1
## [13,] 3 1 2 4
## [14,] 3 1 4 2
## [15,] 3 2 1 4
## [16,] 3 2 4 1
## [17,] 3 4 1 2
## [18,] 3 4 2 1
## [19,] 4 1 2 3
## [20,] 4 1 3 2
## [21,] 4 2 1 3
## [22,] 4 2 3 1
## [23,] 4 3 1 2
## [24,] 4 3 2 1
and runs pretty fast:
user system elapsed
0.001 0.000 0.000
There is also the iterpc package (load it with library(iterpc)):
> I = iterpc(4, ordered=TRUE)
> getall(I)
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 1 2 4 3
[3,] 1 3 2 4
[4,] 1 3 4 2
[5,] 1 4 2 3
[6,] 1 4 3 2
[7,] 2 1 3 4
[8,] 2 1 4 3
[9,] 2 3 1 4
[10,] 2 3 4 1
...
You can also step through the permutations one at a time:
> getnext(I)
[1] 1 2 3 4
> getnext(I)
[1] 1 2 4 3
> getnext(I)
[1] 1 3 2 4
> getnext(I)
[1] 1 3 4 2
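For a route that needs no package, one possible sketch is a small recursive helper. The name perms is hypothetical; it is not a function from base R or from any of the packages mentioned above. It builds all n! rows in memory, so it is only practical for small n.

```r
# Hypothetical helper: all permutations of v, one per row
perms <- function(v) {
  if (length(v) <= 1) return(matrix(v, nrow = 1))
  # fix each element in first position, then permute the rest
  do.call(rbind, lapply(seq_along(v), function(i) {
    cbind(v[i], perms(v[-i]))
  }))
}

p <- perms(1:4)
nrow(p)  # 24, i.e. factorial(4)
```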

find unique elements in matrix based on a subset of columns

I have a table which I want to transform
t LabelA LabelB start stop
[1,] 1 4 9 1 2
[2,] 1 3 5 1 2
[3,] 1 1 6 1 2
[4,] 2 4 9 1 2
[5,] 2 3 5 1 2
[6,] 2 1 6 1 2
[7,] 2 7 2 2 2
[8,] 3 3 5 3 4
[9,] 3 1 6 3 4
[10,] 3 7 2 3 5
[11,] 3 4 9 3 5
I want to filter the data so that rows which differ only by the number in the first column are removed (not completely, only the duplicates). So for rows 1 and 4 only row 1 should remain in the table. Or for row 3 and 9 only row 9 should remain. It is important that the information in the first column is retained and that the earliest occurrence of the row remains in the table, not the later ones.
You can use duplicated:
mat[!duplicated(as.data.frame(mat[, -1])), ]
t LabelA LabelB start stop
[1,] 1 4 9 1 2
[2,] 1 3 5 1 2
[3,] 1 1 6 1 2
[7,] 2 7 2 2 2
[8,] 3 3 5 3 4
[9,] 3 1 6 3 4
[10,] 3 7 2 3 5
[11,] 3 4 9 3 5
where mat is the name of your matrix.
Try using the duplicated function:
mymx <- matrix(c(1,4,9,1,2 ,1,3,5,1,2 ,1,1,6,1,2 ,2,4,9,1,2 ,2,3,5,1,2 ,2,1,6,1,2 ,2,7,2,2,2 ,3,3,5,3,4 ,3,1,6,3,4 ,3,7,2,3,5 ,3,4,9,3,5), ncol=5, byrow=TRUE)
mymx[!duplicated(mymx[,-1]),]
[,1] [,2] [,3] [,4] [,5]
[1,] 1 4 9 1 2
[2,] 1 3 5 1 2
[3,] 1 1 6 1 2
[4,] 2 7 2 2 2
[5,] 3 3 5 3 4
[6,] 3 1 6 3 4
[7,] 3 7 2 3 5
[8,] 3 4 9 3 5
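Both answers rely on duplicated working row-wise on a matrix and marking only the later repeats, so the earliest occurrence is kept. A sketch of the same idea that selects the comparison columns by name instead of by position, assuming the matrix carries the column names shown in the question:

```r
mat <- matrix(c(1,4,9,1,2, 1,3,5,1,2, 1,1,6,1,2, 2,4,9,1,2,
                2,3,5,1,2, 2,1,6,1,2, 2,7,2,2,2, 3,3,5,3,4,
                3,1,6,3,4, 3,7,2,3,5, 3,4,9,3,5),
              ncol = 5, byrow = TRUE,
              dimnames = list(NULL, c("t", "LabelA", "LabelB", "start", "stop")))

# columns that define a duplicate; "t" is deliberately left out
key_cols <- c("LabelA", "LabelB", "start", "stop")

# duplicated() flags every repeat after the first occurrence
keep <- !duplicated(mat[, key_cols, drop = FALSE])
result <- mat[keep, , drop = FALSE]
```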

How do I make a matrix from a list of vectors in R?

Goal: from a list of vectors of equal length, create a matrix where each vector becomes a row.
Example:
> a <- list()
> for (i in 1:10) a[[i]] <- c(i,1:5)
> a
[[1]]
[1] 1 1 2 3 4 5
[[2]]
[1] 2 1 2 3 4 5
[[3]]
[1] 3 1 2 3 4 5
[[4]]
[1] 4 1 2 3 4 5
[[5]]
[1] 5 1 2 3 4 5
[[6]]
[1] 6 1 2 3 4 5
[[7]]
[1] 7 1 2 3 4 5
[[8]]
[1] 8 1 2 3 4 5
[[9]]
[1] 9 1 2 3 4 5
[[10]]
[1] 10 1 2 3 4 5
I want:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 1 2 3 4 5
[2,] 2 1 2 3 4 5
[3,] 3 1 2 3 4 5
[4,] 4 1 2 3 4 5
[5,] 5 1 2 3 4 5
[6,] 6 1 2 3 4 5
[7,] 7 1 2 3 4 5
[8,] 8 1 2 3 4 5
[9,] 9 1 2 3 4 5
[10,] 10 1 2 3 4 5
One option is to use do.call():
> do.call(rbind, a)
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 1 2 3 4 5
[2,] 2 1 2 3 4 5
[3,] 3 1 2 3 4 5
[4,] 4 1 2 3 4 5
[5,] 5 1 2 3 4 5
[6,] 6 1 2 3 4 5
[7,] 7 1 2 3 4 5
[8,] 8 1 2 3 4 5
[9,] 9 1 2 3 4 5
[10,] 10 1 2 3 4 5
simplify2array is a base function that is fairly intuitive. However, since R's default is to fill in data by columns first, you will need to transpose the output. (sapply uses simplify2array, as documented in help(sapply).)
> t(simplify2array(a))
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 1 2 3 4 5
[2,] 2 1 2 3 4 5
[3,] 3 1 2 3 4 5
[4,] 4 1 2 3 4 5
[5,] 5 1 2 3 4 5
[6,] 6 1 2 3 4 5
[7,] 7 1 2 3 4 5
[8,] 8 1 2 3 4 5
[9,] 9 1 2 3 4 5
[10,] 10 1 2 3 4 5
The built-in matrix function has the nice option to enter data byrow. Combining that with unlist on your source list gives you a matrix. We also need to specify the number of rows so it can break up the unlisted data. That is:
> matrix(unlist(a), byrow=TRUE, nrow=length(a) )
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 1 2 3 4 5
[2,] 2 1 2 3 4 5
[3,] 3 1 2 3 4 5
[4,] 4 1 2 3 4 5
[5,] 5 1 2 3 4 5
[6,] 6 1 2 3 4 5
[7,] 7 1 2 3 4 5
[8,] 8 1 2 3 4 5
[9,] 9 1 2 3 4 5
[10,] 10 1 2 3 4 5
Not straightforward, but it works:
> t(sapply(a, unlist))
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 1 2 3 4 5
[2,] 2 1 2 3 4 5
[3,] 3 1 2 3 4 5
[4,] 4 1 2 3 4 5
[5,] 5 1 2 3 4 5
[6,] 6 1 2 3 4 5
[7,] 7 1 2 3 4 5
[8,] 8 1 2 3 4 5
[9,] 9 1 2 3 4 5
[10,] 10 1 2 3 4 5
If the vectors in the list a differ in length, index each one out to the longest length so that shorter vectors are padded with NA:
t(sapply(a, '[', 1:max(sapply(a, length))))
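To see how the `[` approach pads vectors of unequal length, here is a small sketch. The list ragged is made up for illustration, and lengths() is used as a shorthand for sapply(a, length).

```r
# Hypothetical list whose vectors differ in length
ragged <- list(1:3, 1:2, 1:4)

width <- max(lengths(ragged))

# indexing past the end of a vector yields NA, which pads the short rows
padded <- t(sapply(ragged, function(v) v[seq_len(width)]))

dim(padded)  # 3 rows, 4 columns
```

Each list element becomes one row, and positions beyond a vector's own length are NA.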
> library(plyr)
> as.matrix(ldply(a))
V1 V2 V3 V4 V5 V6
[1,] 1 1 2 3 4 5
[2,] 2 1 2 3 4 5
[3,] 3 1 2 3 4 5
[4,] 4 1 2 3 4 5
[5,] 5 1 2 3 4 5
[6,] 6 1 2 3 4 5
[7,] 7 1 2 3 4 5
[8,] 8 1 2 3 4 5
[9,] 9 1 2 3 4 5
[10,] 10 1 2 3 4 5
