R-Duplicate columns in a matrix - r

I have a matrix
1 2
1 3
I want to duplicate each column three times to create a matrix like this:
1 1 1 2 2 2
1 1 1 3 3 3
I don't think I can use rep. I'd really appreciate any help.

You can use rep in this situation, just not on the matrix itself.
This does what you want:
mat1 = cbind(c(1,1), c(2,3))
mat2 = mat1[, rep(1:2, each=3)]
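The same indexing idea should carry over to any matrix and any number of copies. A minimal sketch, assuming a helper called dup_cols (a made-up name, not something from base R):

```r
# dup_cols is a hypothetical helper name used only for illustration
dup_cols <- function(m, times = 3) {
  # repeat each column index `times` times, then index the columns with it
  m[, rep(seq_len(ncol(m)), each = times), drop = FALSE]
}

mat1 <- cbind(c(1, 1), c(2, 3))
mat2 <- dup_cols(mat1)
mat2
```

drop = FALSE is there so that a one-row or one-column input still comes back as a matrix.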

You can actually do it with a single rep inside matrix.
m <- matrix(c(1, 1, 2, 3), nrow = 2)
matrix(rep(as.numeric(t(m)), each = 3), nrow = nrow(m), byrow = TRUE)
Depending on the size of your matrix this might be quicker than using apply.

Assuming your initial matrix is called m1, one option could be:
m2 <- matrix(data = apply(m1, 2, function(x) rep(x, 3)), ncol = ncol(m1)*3)

Related

Indexing using a boolean matrix in R

I'm stumped using indexing in R. I have two matrices, one with logical values and one with data. I want to use the first to index into the second one. However, I've noticed that R reorders my values when doing so.
My data looks roughly like this:
ind <- matrix(c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE, FALSE), nrow = 3, ncol = 4, byrow = TRUE)
data <- matrix(c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4), nrow = 3, ncol = 4, byrow = TRUE)
Now, when indexing result <- data[ind], I was obtaining the following: c(1, 3, 4) when I was trying to obtain c(1, 4, 3).
How can I prevent R from reordering columnwise? I'd appreciate any input. I'm sure it's an easy fix - I just don't know which.
Thank you!
When converting matrices to and from vectors, R makes the transformation columnwise.
as.vector(data)
[1] 1 1 1 2 2 2 3 3 3 4 4 4
As this happens to both your ind and your data this is generally not a problem.
So to retain your order, you have to transpose both matrices:
> t(data)[t(ind)]
[1] 1 4 3
PS: have you tried which with a matrix?
> which(ind, arr.ind = TRUE)
     row col
[1,]   1   1
[2,]   3   3
[3,]   2   4
Here is another base R trick (but I recommend the answer by @c0bra)
> rowSums(data * ind)
[1] 1 4 3
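Note that the rowSums trick relies on every row of ind having exactly one TRUE. If that is not guaranteed, a sketch along these lines may be safer: it takes the positions from which(), sorts them by row, and indexes with the resulting two-column matrix. It reuses ind and data from the question; pos is just an illustrative name.

```r
ind <- matrix(c(TRUE, FALSE, FALSE, FALSE,
                FALSE, FALSE, FALSE, TRUE,
                FALSE, FALSE, TRUE, FALSE),
              nrow = 3, ncol = 4, byrow = TRUE)
data <- matrix(c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4),
               nrow = 3, ncol = 4, byrow = TRUE)

# (row, col) positions of the TRUE cells, in column-major order
pos <- which(ind, arr.ind = TRUE)
# reorder them row by row (and by column within a row)
pos <- pos[order(pos[, "row"], pos[, "col"]), , drop = FALSE]
# a two-column numeric matrix indexes individual cells
result <- data[pos]
result
```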

Creating a new column from a matrix

I have an n by 2 matrix, for example:
x <- matrix(1:4, nrow = 2, ncol = 2)
I have to create a new column which will store the result
(a11+a12)-a22, (a21+a22)-a32, ...
and so on. a32 does not exist, so it is treated as 0. Is there an easy way to do this in R?
I have tried to use the apply() function with no luck. The desired output is a column with values
0
6
Something like this?
x <- matrix(1:4, nrow = 2, ncol = 2)
# obtain the row sum of x
rs = rowSums(x)
# obtain the last column from the matrix
x = x[,ncol(x)]
# remove the first value and add a 0 at the end
# since your last value will always be 0
x = x[-1]
x = c(x, 0)
rs - x
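If you would rather not overwrite x and want the result attached as an actual column, the same steps can be sketched like this. The names shifted and new_col are made up for the example, and column 2 is hard-coded because the question says the matrix is n by 2.

```r
x <- matrix(1:4, nrow = 2, ncol = 2)

# second column shifted up by one row, padded with 0 for the missing last entry
shifted <- c(x[-1, 2], 0)
# (a_i1 + a_i2) - a_(i+1)2 for every row i
new_col <- rowSums(x) - shifted
# attach the result as a third column
x_new <- cbind(x, new_col)
x_new
```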

Replace one column in matrix by > 1 columns

How to easily replace a (N x 1) vector/column of a (N x M) matrix by a (N x K) matrix such that the result is a (N x (M - 1 + K)) matrix?
Example:
a <- matrix(c(1, 3, 4, 5), nrow = 2) # (2 x 2)
b <- matrix(c(1, 3, 5, 6, 7, 7), nrow = 2) # (2 x 3)
I now want to do something like this:
a[, 1, drop = FALSE] <- b # Error
which R does not like.
All I could think of is a two-step approach: attach b to a and subsequently delete column 1. Problem: it changes the order in which the columns appear.
Basically, I want to have a simple drop in replacement. I am sure it is possible somehow.
You can use cbind:
cbind(b, a[,-1])
#      [,1] [,2] [,3] [,4]
# [1,]    1    5    7    4
# [2,]    3    6    7    5
If you need to insert in the middle of a large matrix (say, at column N) rather than at one end, you can use:
cbind(a[, 1:(N-1)], b, a[, (N+1):NCOL(a)])
For a generalized version that works wherever the insert is (start, middle or end) we can use
a <- matrix(1:10, nrow = 2)
b <- matrix(c(100, 100, 100, 100, 100, 100), nrow = 2)
N <- 3 # where we want to insert (must be between 1 and NMAX)
NMAX <- NCOL(a) # the largest column where we can insert
cbind(a[, 0:(N-1)], b, {if(N<NMAX) a[,(N+1):NMAX] else NULL})
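For reuse, the same logic could be wrapped in a small function. This is only a sketch: replace_col is a made-up name, and it assumes N is a valid column of a and that a and b have the same number of rows.

```r
# replace_col is a hypothetical helper, not part of base R
replace_col <- function(a, b, N) {
  stopifnot(N >= 1, N <= ncol(a), nrow(a) == nrow(b))
  # columns before N (a zero-column matrix when N is 1)
  left <- a[, seq_len(N - 1), drop = FALSE]
  # columns after N (a zero-column matrix when N is the last column)
  right <- a[, seq_len(ncol(a) - N) + N, drop = FALSE]
  cbind(left, b, right)
}

a <- matrix(c(1, 3, 4, 5), nrow = 2)        # (2 x 2)
b <- matrix(c(1, 3, 5, 6, 7, 7), nrow = 2)  # (2 x 3)
replace_col(a, b, 1)
```

Using seq_len() instead of 1:(N-1) avoids the usual trap where 1:0 counts backwards.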

Get rows which do not contain 0

I would like to make a new matrix from another matrix, keeping only the rows which do not contain 0. How can I do that?
Here is a more vectorized way.
x <- matrix(c(0,0,0,1,1,0,1,1,1,1), ncol = 2, byrow = TRUE)
x[rowSums(x==0)==0,]
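One caveat: if the matrix can contain NA, x == 0 is NA for those cells, so rowSums() returns NA for that row and the subset picks up an all-NA row. A hedged variant, assuming you want to keep rows whose only non-zero-check failure is a missing value (x_na and kept are illustrative names):

```r
x_na <- matrix(c(0, 0, 0, 1, 1, 0, 1, 1, 1, 1, NA, 1), ncol = 2, byrow = TRUE)

# count zeros per row, ignoring NA cells
n_zero <- rowSums(x_na == 0, na.rm = TRUE)
# keep rows with no zeros; drop = FALSE keeps a matrix even for a single row
kept <- x_na[n_zero == 0, , drop = FALSE]
kept
```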
I found that it could be done very simply
x <- matrix(c(0,0,0,1,1,0,1,1,1,1), ncol = 2, byrow = TRUE)
y <- cbind (x[which(x[,1]*x[,2] >0), 1:2])
I am only piecing together the great suggestions others have already given. I like the ability to store this as a function and generalize to values besides 0, including categorical values (it also selects positively or negatively using the select argument):
v.omit <- function(dataframe, v = 0, select = "neg") {
  switch(select,
    neg = dataframe[apply(dataframe, 1, function(y) !any(y %in% (v))), ],
    pos = dataframe[apply(dataframe, 1, function(y) any(y %in% (v))), ])
}
Let's try it.
x <- matrix(c(0,0,0,1,1,0,1,1,1,1,NA,1), ncol = 2, byrow = TRUE)
v.omit(x)
v.omit(mtcars, 0)
v.omit(mtcars, 1)
v.omit(CO2, "chilled")
v.omit(mtcars, c(4,3))
v.omit(CO2, c('Quebec', 'chilled'))
v.omit(x, select="pos")
v.omit(CO2, c('Quebec', 'chilled'), select="pos")
v.omit(x, NA)
v.omit(x, c(0, NA))
Please do not mark my answer as the correct one as others have answered before me, this is just to extend the conversation. Thanks for the code and the question.
I'm sure there are better ways, but here's one approach. We'll use apply() and the all() function to create a boolean vector to index into the matrix of interest.
x <- matrix(c(0,0,0,1,1,0,1,1,1,1), ncol = 2, byrow = TRUE)
> x
     [,1] [,2]
[1,]    0    0
[2,]    0    1
[3,]    1    0
[4,]    1    1
[5,]    1    1
> x[apply(x, 1, function(y) all(y != 0)), ]
     [,1] [,2]
[1,]    1    1
[2,]    1    1

extract unique rows with a condition in r

I have this kind of data:
x <- matrix(c(2,2,3,3,3,4,4,20,33,2,3,45,6,9,45,454,7,4,6,7,5), nrow = 7, ncol = 3)
In the real dataset, I have a huge matrix with a lot of columns.
I want to extract unique rows with respect to the first column (Id), keeping the row with the minimum of the third column. For instance, for this matrix I would expect
y <- matrix(c(2,3,4,20,3,9,45,4,5), nrow = 3, ncol = 3)
I tried a lot of things but I couldn't figure it out.
Any help is appreciated.
Thanks in advance,
Zeray
Here's a version that is more complicated, but somewhat faster than Chase's ddply solution - some 200x faster :-)
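uniqueMin <- function(m, idCol = 1L, minCol = ncol(m)) {
  # split the row indices by id, then keep the row with the smallest minCol value in each group
  t(vapply(split(seq_len(nrow(m)), m[, idCol]),
           function(i, x, minCol) x[i, , drop = FALSE][which.min(x[i, minCol]), ],
           m[1, ], x = m, minCol = minCol))
}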
And the following test code:
library(plyr) # needed for the ddply comparison
nRows <- 10000
nCols <- 100
ids <- nRows/5
m <- cbind(sample(ids, nRows, replace = TRUE), matrix(runif(nRows*nCols), nRows))
system.time( a<-uniqueMin(m, minCol=3L) ) # 0.07
system.time(ddply(as.data.frame(m), "V1", function(x) x[which.min(x$V3) ,])) # 15.72
You can use package plyr. Convert to a data.frame so you can group on the first column, then use which.min to extract the min row by group:
library(plyr)
ddply(as.data.frame(x), "V1", function(x) x[which.min(x$V3) ,])
  V1 V2 V3
1  2 20 45
2  3  3  4
3  4  9  5
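If you want to stay in base R and keep a matrix, one more option is to sort and then drop duplicated ids. This is only a sketch on the example data from the question; o and y are illustrative names.

```r
x <- matrix(c(2, 2, 3, 3, 3, 4, 4,
              20, 33, 2, 3, 45, 6, 9,
              45, 454, 7, 4, 6, 7, 5), nrow = 7, ncol = 3)

# sort by id (column 1), then by the value to minimise (column 3)
o <- x[order(x[, 1], x[, 3]), , drop = FALSE]
# after sorting, the first row of each id is the one with the smallest column 3
y <- o[!duplicated(o[, 1]), , drop = FALSE]
y
```

When several rows of an id tie on the minimum, this keeps whichever of them sorts first.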
