Delete specific columns - r

I have
F <- structure(c(0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0), .Dim = c(3L,
5L))
How can I remove from F the columns that have fewer than 2 consecutive zeros?
Thx!

We can use rle to find runs of consecutive values (here, zeros) and build a logical condition from the run lengths, applying it to each column (apply with MARGIN = 2):
F[,!apply(F, 2, function(x) with(rle(!x),
any(lengths >= 2 & values))), drop = FALSE]
-output
[,1] [,2]
[1,] 0 0
[2,] 1 1
[3,] 1 1
If it is the opposite, just remove the !
F[,apply(F, 2, function(x) with(rle(!x),
any(lengths >= 2 & values))), drop = FALSE]
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 0 0 0
[3,] 0 0 0
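To see why the rle condition works, it can help to inspect a single column. The sketch below uses the F from the question; the helper name has_zero_run and its parameter n are hypothetical and only generalise the threshold of 2.

```r
F <- structure(c(0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0), .Dim = c(3L, 5L))

x <- F[, 2]          # second column: 1 0 0
r <- rle(!x)         # !x is TRUE where x is 0, so TRUE runs are runs of zeros
r$lengths            # 1 2
r$values             # FALSE TRUE

# Hypothetical helper: does x contain a run of at least n zeros?
has_zero_run <- function(x, n = 2) {
  with(rle(x == 0), any(lengths >= n & values))
}

keep <- apply(F, 2, has_zero_run)   # FALSE TRUE FALSE TRUE TRUE
F[, keep, drop = FALSE]             # columns 2, 4 and 5
```

Negating keep gives the columns without such a run, as in the first snippet of the answer.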

A slightly different approach with rle applied over the columns:
F[, apply(F, 2, \(x) with(rle(x), any(lengths[values == 0] >= 2)))]
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 0 0 0
[3,] 0 0 0

Using pure base, no extra functions, as one-liner:
U = F[, apply((F[-1,]==0) & (F[-nrow(F),]==0), 2, any)]
Breakdown:
U = F[                                 # Select...
  ,                                    # ...from all the rows in the matrix...
  apply(                               # ...the columns that have...
    (F[-nrow(F),]==0) & (F[-1,]==0),   # ...one value = 0 and the next value = 0...
    2,                                 # ...checked column by column (i.e. 2nd dimension)...
    any                                # ...anywhere in the column.
  )
]
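The shift-and-compare step can be followed more easily with the intermediate matrices written out. This is a sketch using the F from the question; the names top, bot, both and keep are illustrative, and drop = FALSE is added here (an assumption beyond the original one-liner) so that the code also behaves when only one row or column remains.

```r
F <- structure(c(0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0), .Dim = c(3L, 5L))

top  <- F[-nrow(F), , drop = FALSE] == 0   # rows 1..(n-1): is the value zero?
bot  <- F[-1, , drop = FALSE] == 0         # rows 2..n: is the next value zero?
both <- top & bot                          # TRUE where a zero is followed by a zero

keep <- apply(both, 2, any)                # FALSE TRUE FALSE TRUE TRUE
U <- F[, keep, drop = FALSE]               # columns 2, 4 and 5
```

This only detects runs of length two or more; for a longer minimum run length the rle approach is probably easier to adapt.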

Related

Calculating 4 or n Sum in R

I am practicing LeetCode problems for Data Scientist interviews in R, and one of the questions I came across is four-sum. To solve it, I generate all combinations of four elements and calculate their sums using the apply function. Is there a better way to optimize it in R without using combn?
GetFourSumCombinations <- function(TestVector,Target){
CombinationPairs = combn(TestVector, 4) ## Get all the combinations
SumOfAllCombinations = apply(CombinationPairs, 2, sum)
TargetElements = which(SumOfAllCombinations == Target)
return(CombinationPairs[,TargetElements])
}
## Output:
TestVector = c(1, 0, -1, 0, -2, 2); Target = 0
GetFourSumCombinations(TestVector,0)
[,1] [,2] [,3]
[1,] 1 1 0
[2,] 0 -1 0
[3,] -1 -2 -2
[4,] 0 2 2
Here is a bit shorter version
GetFourSumCombinations <- function(TestVector,Target){
vals <- combn(TestVector, 4)
vals[, colSums(vals) == Target]
}
GetFourSumCombinations(TestVector, Target)
# [,1] [,2] [,3]
#[1,] 1 1 0
#[2,] 0 -1 0
#[3,] -1 -2 -2
#[4,] 0 2 2
data
TestVector <- c(1, 0, -1, 0, -2, 2)
Target = 0
1) Run combn, convert the result to a data.frame and then Filter out the desired columns. This has a one-line body and no subscripting.
target4 <- function(x, target = 0) {
Filter(function(x) sum(x) == target, as.data.frame(combn(x, 4)))
}
TestVector <- c(1, 0, -1, 0, -2, 2)
target4(TestVector)
giving:
V1 V9 V14
1 1 1 0
2 0 -1 0
3 -1 -2 -2
4 0 2 2
2) Longer but does not use combn.
target4a <- function(x, target = 0) {
g <- do.call("expand.grid", rep(list(seq_along(x)), 4))
ok <- apply(g, 1, function(x) all(diff(x) > 0))
g2 <- apply(g[ok, ], 1, function(ix) x[ix])
g2[, colSums(g2) == target]
}
target4a(TestVector)
3) or perhaps break up (2) into a custom combn and (1).
combn4 <- function(x) {
g <- do.call("expand.grid", rep(list(seq_along(x)), 4))
ok <- apply(g, 1, function(x) all(diff(x) > 0))
apply(g[ok, ], 1, function(ix) x[ix])
}
target4b <- function(x, target = 0) {
Filter(function(x) sum(x) == target, as.data.frame(combn4(x)))
}
target4b(TestVector)
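None of the answers above avoids enumerating all four-element combinations. A commonly used alternative for this kind of problem is to sort the vector and use two nested loops plus two moving pointers. The sketch below is not from the answers; four_sum and its variable names are hypothetical. Note that it returns each distinct quadruplet of values once (the LeetCode convention), whereas the combn versions return one column per combination of positions.

```r
four_sum <- function(x, target = 0) {
  x <- sort(x)
  n <- length(x)
  res <- list()
  if (n < 4) return(res)
  for (i in 1:(n - 3)) {
    if (i > 1 && x[i] == x[i - 1]) next            # skip duplicate first values
    for (j in (i + 1):(n - 2)) {
      if (j > i + 1 && x[j] == x[j - 1]) next      # skip duplicate second values
      lo <- j + 1
      hi <- n
      while (lo < hi) {
        s <- x[i] + x[j] + x[lo] + x[hi]
        if (s == target) {
          res[[length(res) + 1]] <- c(x[i], x[j], x[lo], x[hi])
          lo <- lo + 1
          hi <- hi - 1
          # move past repeated values so each quadruplet is reported once
          while (lo < hi && x[lo] == x[lo - 1]) lo <- lo + 1
          while (lo < hi && x[hi] == x[hi + 1]) hi <- hi - 1
        } else if (s < target) {
          lo <- lo + 1                             # sum too small: raise the low end
        } else {
          hi <- hi - 1                             # sum too large: lower the high end
        }
      }
    }
  }
  res
}

TestVector <- c(1, 0, -1, 0, -2, 2)
quads <- four_sum(TestVector, 0)
do.call(cbind, quads)   # one quadruplet per column
```

For the test vector this finds the three quadruplets (-2, -1, 1, 2), (-2, 0, 0, 2) and (-1, 0, 0, 1). Whether it is actually faster than combn in R depends on the input size, since explicit loops in R carry their own overhead.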

Matrix fill zero rows with one

I have an Adjacency matrix A:
[1] [2] [3]
[1] 1 0 1
[2] 0 0 0
[3] 0 0 0
I want a new matrix B filled with ones in all rows that contain only zeros, so B:
[1] [2] [3]
[1] 0 0 0
[2] 1 1 1
[3] 1 1 1
How can I do that in R?
Since an adjacency matrix is nonnegative, you can just test for zero rowSums:
A <- matrix(c(1, 0, 1,
0, 0, 0,
0, 0, 0), byrow = TRUE, nrow = 3)
B <- A * 0 # 0-matrix same dimensions as A
B[rowSums(A) == 0, ] <- 1
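The nonnegativity assumption matters. As a small sketch (the matrix A_signed is made up for illustration), a row whose entries cancel out also has a zero row sum, so counting nonzero entries is the safer test when negative values are possible:

```r
A_signed <- matrix(c(1, -1, 0,
                     0,  0, 0), nrow = 2, byrow = TRUE)

rowSums(A_signed) == 0        # TRUE TRUE  (row 1 is wrongly flagged)
rowSums(A_signed != 0) == 0   # FALSE TRUE (only the all-zero row)

B_signed <- A_signed * 0
B_signed[rowSums(A_signed != 0) == 0, ] <- 1
```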
apply() can be useful here. Try
t(apply(A, 1, function(x) rep(as.numeric(all(x == 0)), length(x))))
The apply call below returns a logical vector with one element per row of A: TRUE if that row is all zeroes and FALSE otherwise. replace then assigns that vector to every element of A, recycling it so that each column of A is filled with it. The assignment also converts FALSE and TRUE to 0 and 1. The expression is compact, fairly straightforward to understand and does not use any packages.
replace(A, TRUE, apply(A == 0, 1, all))
giving:
[,1] [,2] [,3]
[1,] 0 0 0
[2,] 1 1 1
[3,] 1 1 1
This variation is even shorter but is slightly trickier:
replace(A, TRUE, apply(!A, 1, all))
Note
The input A in reproducible form is:
A <- matrix(c(1L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L), 3)
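The recycling that replace relies on can be checked on a non-square matrix. This is a sketch with a made-up 2 x 4 matrix A2; the names zero_row and B2 are illustrative.

```r
A2 <- matrix(c(1, 0, 0, 0,
               0, 0, 0, 0), nrow = 2, byrow = TRUE)

zero_row <- apply(A2 == 0, 1, all)   # FALSE TRUE: one flag per row

# The index TRUE selects every element; the two flags are recycled in
# column-major order, so each column receives c(FALSE, TRUE), stored as 0 and 1.
B2 <- replace(A2, TRUE, zero_row)
B2
```

Because the flag vector has exactly one entry per row, the column-major recycling lines up with the rows regardless of the number of columns.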
Another way
m <- matrix( c(1, 0, 1, 0, 0, 0, 0, 0, 0), nrow = 3, byrow = TRUE )
> m
[,1] [,2] [,3]
[1,] 1 0 1
[2,] 0 0 0
[3,] 0 0 0
m <- matrix(rep(ifelse(rowSums(m) == 0, 1, 0), ncol(m)), nrow = nrow(m))
> m
[,1] [,2] [,3]
[1,] 0 0 0
[2,] 1 1 1
[3,] 1 1 1

How do I generate following matrix?

I have a solution matrix (say A) for the indefinite equation x1+x2+x3+x4 = 6. I also have another matrix (say B) whose columns are
0 1 0 1
0 0 1 1
I want to generate matrices using rows of A and the columns of B.
For example, let (2,0,1,3) be one solution (one row) of the matrix A. Then the columns of my new matrix are
0 0 0 1 1 1
0 0 1 1 1 1
The columns of this new matrix are repeated columns of B, i.e., the first column 2 times, the second column 0 times, the third column once and the fourth column 3 times. I want to apply this procedure to all the rows of matrix A.
use rep:
b <- matrix(c(0, 0, 1, 0, 0, 1, 1, 1), nrow = 2)
a <- c(2, 0, 1, 3)
b[, rep(1:ncol(b), a)]
if a has many rows:
lapply(1:nrow(a), function(i) b[, rep(1:ncol(b), a[i, ])])
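As a quick check of the rep approach with a matrix of solutions, here is a sketch in which a has two rows that each sum to 6. The second row and the name res are made up for illustration, and drop = FALSE is added so that a single repeated column would still come back as a matrix.

```r
b <- matrix(c(0, 0, 1, 0, 0, 1, 1, 1), nrow = 2)
a <- rbind(c(2, 0, 1, 3),
           c(0, 6, 0, 0))

res <- lapply(seq_len(nrow(a)), function(i) {
  # repeat column k of b exactly a[i, k] times
  b[, rep(seq_len(ncol(b)), a[i, ]), drop = FALSE]
})

res[[1]]   # rows: 0 0 0 1 1 1 and 0 0 1 1 1 1
res[[2]]   # second column of b, six times
```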
> B <- rbind(c(0, 1, 0, 1), c( 0, 0, 1, 1))
> A <- rbind(c(2,0,1,3), c(2,0,1,3))
> do.call(rbind, lapply(1:nrow(A), function(jj) t(sapply(1:nrow(B), function(j) do.call(c, lapply(1:4, function(i) rep(B[j,i], A[jj,i]))) ))))
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0 0 0 1 1 1
[2,] 0 0 1 1 1 1
[3,] 0 0 0 1 1 1
[4,] 0 0 1 1 1 1

Extract sub-matrices from binary matrix in R

Say binary matrix m:
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
# [1,] 0 0 0 0 0 0 0 0 0
# [2,] 0 0 0 0 0 0 0 0 0
# [3,] 0 0 0 1 1 1 1 0 0
# [4,] 0 0 0 1 1 1 1 0 0
# [5,] 0 0 0 1 1 1 1 0 0
# [6,] 0 0 0 0 0 0 0 0 0
# [7,] 0 1 1 0 0 0 0 1 1
# [8,] 0 1 1 0 1 1 0 1 1
# [9,] 0 0 0 0 1 1 0 1 1
# [10,] 0 0 0 0 1 1 0 0 0
m <- structure(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0,
0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1,
1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0,
0, 0, 0, 0, 0, 0, 1, 1, 1, 0), .Dim = c(10L, 9L))
How can we extract those 1-valued sub-matrices? e.g.
m[7:9,8:9]
# [,1] [,2]
#[1,] 1 1
#[2,] 1 1
#[3,] 1 1
The point is that I want to extract them algorithmically, not by indexing them explicitly like m[7:9,8:9].
The input is a binary matrix
List of sub-matrices as output (so list of four matrices of dim 3*4, 2*2, 3*2 and 3*2)
Sub-matrices are 1-valued rectangular
The border of the sub-matrices are secured with zeros.
I'd treat it as a spatial problem where you have a raster and want to detect regions of connected cells.
library(raster)
r <- raster(m)
library(igraph)
rc <- clump(r)
plot(rc, col = rainbow(rc@data@max))
m1 <- as.matrix(rc)
lapply(seq_len(rc@data@max), function(x) {
inds <- which(m1 == x, arr.ind = TRUE)
nrow <- diff(range(inds[, "row"])) + 1
ncol <- diff(range(inds[, "col"])) + 1
matrix(1, ncol = ncol, nrow = nrow)
})
#[[1]]
# [,1] [,2] [,3] [,4]
#[1,] 1 1 1 1
#[2,] 1 1 1 1
#[3,] 1 1 1 1
#
#[[2]]
# [,1] [,2]
#[1,] 1 1
#[2,] 1 1
#
#[[3]]
# [,1] [,2]
#[1,] 1 1
#[2,] 1 1
#[3,] 1 1
#
#[[4]]
# [,1] [,2]
#[1,] 1 1
#[2,] 1 1
#[3,] 1 1
Use focal in the raster package with an appropriate weighting matrix w. It convolves w with m, giving a matrix of the same dimensions as m with the value of big at each upper left corner and other values elsewhere, so comparing it to big gives a logical matrix which is TRUE at the upper left corners of rectangles. Using which we get rc, which has one row per rectangle and two columns representing the i and j coordinates of the upper left of that rectangle. The Map call iterates over the upper left coordinates, invoking genmat on each. genmat uses rle (as defined in the rl function) to find the length of the run of ones in each coordinate direction and returns a matrix of ones having those dimensions.
library(raster)
big <- 100
r <- raster(m)
w <- matrix(0, 3, 3); w[1:2, 1:2] <- 1; w[2, 2] <- big
rc <- which(as.matrix(focal(r, w, pad = TRUE, padValue = 0)) == big, arr.ind = TRUE)
rl <- function(x) rle(x)$lengths[1]
genmat <- function(i, j) matrix(1, rl(m[i:nrow(m), j]), rl(m[i, j:ncol(m)]))
Map(genmat, rc[, 1], rc[, 2])
giving:
[[1]]
[,1] [,2]
[1,] 1 1
[2,] 1 1
[[2]]
[,1] [,2] [,3] [,4]
[1,] 1 1 1 1
[2,] 1 1 1 1
[3,] 1 1 1 1
[[3]]
[,1] [,2]
[1,] 1 1
[2,] 1 1
[3,] 1 1
[[4]]
[,1] [,2]
[1,] 1 1
[2,] 1 1
[3,] 1 1
Update: Simplified code.
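The same upper-left-corner test can be sketched in base R without focal, assuming (as the question states) that every rectangle is bordered by zeros. The names p, up, left and corner are illustrative; rl and genmat follow the answer above, and m is rebuilt here from the printed matrix in the question.

```r
# Rebuild the example matrix from the question.
m <- matrix(0, 10, 9)
m[3:5, 4:7] <- 1
m[7:8, 2:3] <- 1
m[8:10, 5:6] <- 1
m[7:9, 8:9] <- 1

# Pad with a zero row on top and a zero column on the left.
p <- rbind(0, cbind(0, m))
up   <- p[1:nrow(m), 2:(ncol(m) + 1)]   # value directly above each cell
left <- p[2:(nrow(m) + 1), 1:ncol(m)]   # value directly to the left of each cell

# A cell is an upper left corner if it is 1 and has zeros above and to the left.
corner <- m == 1 & up == 0 & left == 0
rc <- which(corner, arr.ind = TRUE)

rl <- function(x) rle(x)$lengths[1]     # length of the first run
genmat <- function(i, j) matrix(1, rl(m[i:nrow(m), j]), rl(m[i, j:ncol(m)]))
res <- Map(genmat, rc[, 1], rc[, 2])
```

For non-rectangular blobs this corner test is not sufficient, and a labelling approach would be needed instead.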
A rather long-winded answer, but you can do this via image labeling as I did in this SO answer. This will extend nicely to non-rectangular blobs of 1's.
find.contiguous <- function(img, x, bg) {
## we need to deal with a single (row,col) matrix index
## versus a collection of them in a two column matrix separately.
if (length(x) > 2) {
lbl <- img[x][1]
img[x] <- bg
xc <- x[,1]
yc <- x[,2]
} else {
lbl <- img[x[1],x[2]]
img[x[1],x[2]] <- bg
xc <- x[1]
yc <- x[2]
}
## clamp the neighboring coordinates to the image bounds
xmin <- ifelse((xc-1) < 1, 1, (xc-1))
xmax <- ifelse((xc+1) > nrow(img), nrow(img), (xc+1))
ymin <- ifelse((yc-1) < 1, 1, (yc-1))
ymax <- ifelse((yc+1) > ncol(img), ncol(img), (yc+1))
## find all neighbors of x
x <- rbind(cbind(xmin, ymin),
cbind(xc , ymin),
cbind(xmax, ymin),
cbind(xmin, yc),
cbind(xmax, yc),
cbind(xmin, ymax),
cbind(xc , ymax),
cbind(xmax, ymax))
## that have the same label as the original x
x <- x[img[x] == lbl,]
## if there is none, we stop and return the updated image
if (length(x)==0) return(img);
## otherwise, we call this function recursively
find.contiguous(img,x,bg)
}
find.contiguous is a recursive function in which for each call it receives:
A working copy of the image img.
A collection of pixel (matrix) indices x (row,col) that belong to an object in the image img.
The background value bg
find.contiguous then proceeds to:
Set all pixels at x in img to the bg color. This marks that we have visited the pixels.
Find all neighboring pixels of x that have the same label (value) as that in x. This grows the region of the same object. Note that since x is not necessarily a single pixel, x grows geometrically so that, in fact, this function is no slouch.
If there are no more neighbors belonging to the same object, we return the updated image; otherwise, we make the recursive call.
Starting from a single pixel that corresponds to an object, a call to find.contiguous will grow the region to include all the object's pixels and return an updated image where the object is replaced by the background. This process can then be repeated in a loop until there are no more objects in the image, hence the ability to extract all sub-matrices of 1's.
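For comparison, the same region-growing idea can be written iteratively with a queue instead of recursion. This is only a sketch, not the answer's function: label_regions and its variable names are hypothetical, it assumes a 0/1 matrix and 8-connectivity, and it labels the objects rather than erasing them.

```r
label_regions <- function(m) {
  nr <- nrow(m)
  nc <- ncol(m)
  lab <- matrix(0L, nr, nc)
  k <- 0L
  for (start in which(m == 1)) {          # linear indices, column-major order
    if (lab[start] != 0L) next            # pixel already belongs to an object
    k <- k + 1L
    lab[start] <- k
    queue <- start
    while (length(queue) > 0) {
      cur <- queue[1]
      queue <- queue[-1]
      ri <- (cur - 1L) %% nr + 1L         # row of the current pixel
      ci <- (cur - 1L) %/% nr + 1L        # column of the current pixel
      for (dr in -1:1) for (dc in -1:1) {
        rr <- ri + dr
        cc <- ci + dc
        if (rr < 1 || rr > nr || cc < 1 || cc > nc) next
        if (m[rr, cc] == 1 && lab[rr, cc] == 0L) {
          lab[rr, cc] <- k                # mark as visited and enqueue
          queue <- c(queue, (cc - 1L) * nr + rr)
        }
      }
    }
  }
  lab
}

# Example matrix from the question.
m <- matrix(0, 10, 9)
m[3:5, 4:7] <- 1; m[7:8, 2:3] <- 1; m[8:10, 5:6] <- 1; m[7:9, 8:9] <- 1

lab <- label_regions(m)
out <- lapply(seq_len(max(lab)), function(k) {
  idx <- which(lab == k, arr.ind = TRUE)
  # bounding box of object k, cut out of the original matrix
  m[min(idx[, 1]):max(idx[, 1]), min(idx[, 2]):max(idx[, 2]), drop = FALSE]
})
```

Cutting out the bounding box returns the actual values of m, so a non-rectangular blob would show up with zeros inside its box.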
With your data:
m <- structure(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0,
0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1,
1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0,
0, 0, 0, 0, 0, 0, 1, 1, 1, 0), .Dim = c(10L, 9L))
## make a copy to img which will be converted to all-zeros in the process
## as matrices of 1's are extracted by the process
img <- m
## get all pixel coordinates that are objects
x <- which(img==1, arr.ind=TRUE)
## loop until there are no more pixels that are objects
##the output is in the list out
count <- 0
out <- list()
while (length(x) > 0) {
## choose a single (e.g., first) pixel location. This belongs to the current
## object that we will grow and remove from the image using find.contiguous
x1 <- x[1,]
## make the call to remove the object from img
img <- find.contiguous(img, x1, 0)
## find the remaining pixel locations belonging to objects
xnew <- which(img==1, arr.ind=TRUE)
count <- count + 1
## extract the indices for the 1's found by diffing new with x
out.ind <- x[!(x[,1] %in% xnew[,1] & x[,2] %in% xnew[,2]),]
## set it as a matrix in the output
out[[count]] <- matrix(m[out.ind],nrow=length(unique(out.ind[,1])),ncol=length(unique(out.ind[,2])))
x <- xnew
}
Your output is the list out:
print(out)
##[[1]]
## [,1] [,2]
##[1,] 1 1
##[2,] 1 1
##
##[[2]]
## [,1] [,2] [,3] [,4]
##[1,] 1 1 1 1
##[2,] 1 1 1 1
##[3,] 1 1 1 1
##
##[[3]]
## [,1] [,2]
##[1,] 1 1
##[2,] 1 1
##[3,] 1 1
##
##[[4]]
## [,1] [,2]
##[1,] 1 1
##[2,] 1 1
##[3,] 1 1
Note that you can just as easily output the locations of the extracted 1's from out.ind:

Replacing row-values in matrix by its row index

I use an R matrix (for example [[0,0,0,1],[0,1,0,1],[1,0,0,0],[0,0,1,1]]) representing
a raster. I'd like to replace every nonzero value with its row index. Is there something like
matrix[matrix==1] <- row_index
so that my result would look like [[0,0,0,1],[0,2,0,2],[3,0,0,0],[0,0,4,4]]?
I am using R 2.15.1 on a Mac (10.7.5) and RPY2 2.2.6 to start the R-Methods.
Or is there any other way to get reasonable results for statistical functions like histogram, chi_square etc.?
For a succinct, expressive solution, I'd be likely to use this:
m <- matrix(c(0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1),
nrow = 4, byrow = TRUE)
m[m!=0] <- row(m)[m!=0]
m
# [,1] [,2] [,3] [,4]
# [1,] 0 0 0 1
# [2,] 0 2 0 2
# [3,] 3 0 0 0
# [4,] 0 0 4 4
Hopefully all hell won't break loose for suggesting a for loop, but we'll see:
Here's your matrix
mymat <- matrix(c(0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1),
nrow = 4, byrow = TRUE)
mymat
# [,1] [,2] [,3] [,4]
# [1,] 0 0 0 1
# [2,] 0 1 0 1
# [3,] 1 0 0 0
# [4,] 0 0 1 1
Here's a for loop that uses basic subsetting to identify the cases you want to replace.
for (i in 1:nrow(mymat)) {
mymat[i, ][mymat[i, ] != 0] <- i
}
Here's the result.
mymat
# [,1] [,2] [,3] [,4]
# [1,] 0 0 0 1
# [2,] 0 2 0 2
# [3,] 3 0 0 0
# [4,] 0 0 4 4
Maybe I'm missing the point of your question, but how about
> m <- matrix(c(0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1),
+ nrow = 4, byrow = TRUE)
> m * 1:nrow(m)
[,1] [,2] [,3] [,4]
[1,] 0 0 0 1
[2,] 0 2 0 2
[3,] 3 0 0 0
[4,] 0 0 4 4
>
(ETA: R stores matrices column by column, and the elementwise * operator recycles the shorter operand, here the vector 1:nrow(m), down each column in turn. Because the vector has exactly one entry per row, every element ends up multiplied by the index of the row to which it belongs.)
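The recycling argument can be checked on a non-square matrix. This is a sketch with a made-up 2 x 3 matrix m2; it also shows that the column-index analogue needs col(), because a vector of column indices would not line up with column-major recycling.

```r
m2 <- matrix(c(1, 0, 1,
               0, 1, 1), nrow = 2, byrow = TRUE)

by_recycling <- m2 * 1:nrow(m2)   # vector 1:2 recycled down each column
by_row_index <- m2 * row(m2)      # explicit matrix of row indices

identical(by_recycling, by_row_index)   # TRUE
by_recycling
#      [,1] [,2] [,3]
# [1,]    1    0    1
# [2,]    0    2    2

by_col_index <- m2 * col(m2)      # each nonzero value replaced by its column index
```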

Resources