I am practicing LeetCode problems in R for Data Scientist interviews, and one of the questions I came across is foursum (4Sum). To solve it, I generate all combinations of four elements and calculate each sum with the apply function. Is there a better, more optimized way to do this in R without using combn?
GetFourSumCombinations <- function(TestVector,Target){
CombinationPairs = combn(TestVector, 4) ## Get all the combinations
SumOfAllCombinations = apply(CombinationPairs, 2, sum)
TargetElements = which(SumOfAllCombinations == Target)
return(CombinationPairs[,TargetElements])
}
## Output:
TestVector <- c(1, 0, -1, 0, -2, 2); Target <- 0
GetFourSumCombinations(TestVector,0)
[,1] [,2] [,3]
[1,] 1 1 0
[2,] 0 -1 0
[3,] -1 -2 -2
[4,] 0 2 2
Here is a bit shorter version
GetFourSumCombinations <- function(TestVector,Target){
vals <- combn(TestVector, 4)
vals[, colSums(vals) == Target]
}
GetFourSumCombinations(TestVector, Target)
# [,1] [,2] [,3]
#[1,] 1 1 0
#[2,] 0 -1 0
#[3,] -1 -2 -2
#[4,] 0 2 2
data
TestVector <- c(1, 0, -1, 0, -2, 2)
Target = 0
1) Run combn, convert the result to a data.frame, and then use Filter to keep the columns whose sum equals the target. This has a one-line body and no subscripting.
target4 <- function(x, target = 0) {
Filter(function(x) sum(x) == target, as.data.frame(combn(x, 4)))
}
TestVector <- c(1, 0, -1, 0, -2, 2)
target4(TestVector)
giving:
V1 V9 V14
1 1 1 0
2 0 -1 0
3 -1 -2 -2
4 0 2 2
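2) Longer, but does not use combn: build every 4-tuple of indices with expand.grid, keep only the strictly increasing ones, then filter by sum. Note that expand.grid creates length(x)^4 rows, so this is only practical for short vectors.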
target4a <- function(x, target = 0) {
g <- do.call("expand.grid", rep(list(seq_along(x)), 4))
ok <- apply(g, 1, function(x) all(diff(x) > 0))
g2 <- apply(g[ok, ], 1, function(ix) x[ix])
g2[, colSums(g2) == target]
}
target4a(TestVector)
3) Or perhaps split (2) into a custom combn replacement and reuse the Filter step from (1).
combn4 <- function(x) {
g <- do.call("expand.grid", rep(list(seq_along(x)), 4))
ok <- apply(g, 1, function(x) all(diff(x) > 0))
apply(g[ok, ], 1, function(ix) x[ix])
}
target4b <- function(x, target = 0) {
Filter(function(x) sum(x) == target, as.data.frame(combn4(x)))
}
target4b(TestVector)
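If the goal is the LeetCode-style answer rather than a combn replacement, a common alternative is to sort the vector, fix the first two elements with nested loops, and move two pointers inward for the remaining pair. Below is a minimal sketch of that idea; the name four_sum is my own, and unlike the combn versions above it returns each distinct quadruplet only once, so the output can differ when the input contains repeated values.

```r
# Sort + two-pointer sketch (hypothetical helper, not from the answers above).
# Returns a 4 x k matrix whose columns are the distinct quadruplets summing to target.
four_sum <- function(x, target = 0) {
  x <- sort(x)
  n <- length(x)
  if (n < 4) return(matrix(numeric(0), nrow = 4))
  res <- list()
  for (i in seq_len(n - 3)) {
    if (i > 1 && x[i] == x[i - 1]) next          # skip repeated first element
    for (j in (i + 1):(n - 2)) {
      if (j > i + 1 && x[j] == x[j - 1]) next    # skip repeated second element
      lo <- j + 1
      hi <- n
      while (lo < hi) {
        s <- x[i] + x[j] + x[lo] + x[hi]
        if (s < target) {
          lo <- lo + 1
        } else if (s > target) {
          hi <- hi - 1
        } else {
          res[[length(res) + 1]] <- c(x[i], x[j], x[lo], x[hi])
          lo <- lo + 1
          hi <- hi - 1
          # step past duplicates of the pair just recorded
          while (lo < hi && x[lo] == x[lo - 1]) lo <- lo + 1
          while (lo < hi && x[hi] == x[hi + 1]) hi <- hi - 1
        }
      }
    }
  }
  if (length(res) == 0) return(matrix(numeric(0), nrow = 4))
  do.call(cbind, res)
}

res <- four_sum(c(1, 0, -1, 0, -2, 2), 0)
res   # three columns, each summing to 0
```

This does roughly n^3 steps instead of enumerating all choose(n, 4) combinations, which should matter mostly for longer vectors.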
Related
I have a huge matrix and I need to divide each of its columns by the column sum (if the sum is not zero).
I have used a loop, but since the matrix is very big, it takes a long time to run.
sum_D<- colSums(R_t)
for(i in 1:NR){
if(sum_D[i]>0){
R_t[,i]<-c(as.numeric(R_t[,i])/sum_D[i])
}
}
Then I wrote this code, but its result is not a matrix.
matrixp<- apply(X=R_1, MARGIN=2, FUN=ColpSum)
ColpSum<-function(x){
x<-as.matrix(x)
if(colSums(x)==0){
return(0)
}
else{
return(x/colSums(x))
}
}
How can I solve the problem?
For example:
|1|2|3|4|
|:----|:----|:----|:----|
|2|0|0|0|
|0|1|0|0|
|0|1|0|0|
results:
|1|2|3|4|
|:----|:----|:----|:----|
|1|0|0|0|
|0|0.5|0|0|
|0|0.5|0|0|
data:
test_matrix <- matrix(c(1,2,3,0,0,0,3,2,1),nrow=3)
base R approach:
ColSum2<-function(x){
#x<-as.matrix(x)
if(sum(x)==0){
return(1)
}
else{
return(sum(x))
}
}
sum_value <- apply(test_matrix,2,ColSum2)
t(t(test_matrix)/sum_value)
data.frame approach:
ColpSum<-function(x){
#x<-as.matrix(x)
if(sum(x)==0){
return(0)
}
else{
return(x/sum(x))
}
}
library(dplyr)
test_matrix%>%as.data.frame()%>%mutate_all(ColpSum)%>%as.matrix()
x <- matrix(c(2,0,0,0,1,1,0,0,0,0,0,0), nrow = 3L)
cs_x <- colSums(x)
cols2div <- which(cs_x > 0)
x[, cols2div] <- vapply(cols2div, \(i) x[, i] / cs_x[i], numeric(nrow(x)))
[,1] [,2] [,3] [,4]
[1,] 1 0.0 0 0
[2,] 0 0.5 0 0
[3,] 0 0.5 0 0
I would use sweep() and then replace NAs, i.e.
m3 <- sweep(m2, 2, colSums(m2), '/')
m3[] <- replace(m3, is.na(m3), 0)
[,1] [,2] [,3] [,4]
[1,] 1 0.0 0 0
[2,] 0 0.5 0 0
[3,] 0 0.5 0 0
DATA
m2 <- structure(c(2, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0), dim = 3:4)
If m is your matrix:
cs <- colSums(m)
cs[cs == 0] <- 1
# apply() over rows returns the result transposed, so transpose it back
t(apply(m, 1, \(row) row / cs))
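To tie the approaches in this thread together, here is a small sketch that wraps the "replace zero sums by 1, then divide" idea in a function using sweep(). The name normalize_cols is hypothetical. It assumes non-negative entries, so a zero column sum means an all-zero column; such columns are left unchanged, as in the loop from the question.

```r
# Example matrix from the question (3 rows, 4 columns)
m <- matrix(c(2, 0, 0,  0, 1, 1,  0, 0, 0,  0, 0, 0), nrow = 3)

normalize_cols <- function(m) {   # hypothetical helper name
  cs <- colSums(m)
  cs[cs == 0] <- 1                # zero-sum columns are divided by 1, so they are unchanged
  sweep(m, 2, cs, "/")            # divide column j by cs[j]; the result is still a matrix
}

res <- normalize_cols(m)
res
```

Because sweep() works on the whole matrix at once, there is no explicit loop over columns and no transposing back and forth.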
I have
F <- structure(c(0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0), .Dim = c(3L,
5L))
How can I remove from F the columns that have fewer than 2 consecutive zeros?
Thx!
We may use rle to find the runs of consecutive values in each column (looping over the columns with apply, MARGIN = 2) and build a logical condition from the run lengths and values that flags columns containing a run of at least two zeros.
F[,!apply(F, 2, function(x) with(rle(!x),
any(lengths >= 2 & values))), drop = FALSE]
-output
[,1] [,2]
[1,] 0 0
[2,] 1 1
[3,] 1 1
If it is the opposite, just remove the !
F[,apply(F, 2, function(x) with(rle(!x),
any(lengths >= 2 & values))), drop = FALSE]
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 0 0 0
[3,] 0 0 0
A slightly different approach with rle applied over the columns:
F[, apply(F, 2, \(x) with(rle(x), any(lengths[values == 0] >= 2)))]
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 0 0 0
[3,] 0 0 0
Using pure base, no extra functions, as one-liner:
U = F[, apply((F[-1,]==0) & (F[-nrow(F),]==0), 2, any)]
Breakdown:
U = F[ # Select...
, # ...all the rows, but only the columns...
apply( # ...for which...
(F[-nrow(F),]==0) & (F[-1,]==0), # ...a value is 0 and the value directly below it is also 0...
2, # ...checked column by column (i.e. 2nd dimension)...
any # ...anywhere in the column.
)
]
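The rle-based answers above can be generalised to runs of any minimum length. The sketch below is only an illustration: has_zero_run and its argument k are names I made up, and the matrix is called Fmat here so that it does not mask R's shorthand F for FALSE.

```r
# Does the vector x contain a run of at least k consecutive zeros?
has_zero_run <- function(x, k = 2) {   # hypothetical helper
  r <- rle(x == 0)                     # runs of TRUE (zero) / FALSE (non-zero)
  any(r$values & r$lengths >= k)
}

Fmat <- structure(c(0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0), .Dim = c(3L, 5L))

keep <- apply(Fmat, 2, has_zero_run, k = 2)
kept    <- Fmat[,  keep, drop = FALSE]   # columns with >= 2 consecutive zeros
dropped <- Fmat[, !keep, drop = FALSE]   # the remaining columns
```

Using drop = FALSE keeps the result a matrix even when only one column survives.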
I have a matrix
new<-matrix(9,4,4)
new[1,1]<-0
new[2,1]<-0
m1<-matrix(0,2,1)
m2<-matrix(0,1,2)
The matrices thus look like this:
m1:
0
0
m2:
0 0
new:
0 9 9 9
0 9 9 9
9 9 9 9
9 9 9 9
I now want to check if this matrix contains the matrices m1 or m2.
So I did
m1 %in% new
m2 %in% new
and obtain TRUE TRUE for both
but would like to get TRUE for m1 and FALSE for m2
As your example matrices m1 and m2 can be converted to vectors, you can use my previous answer at "Check if vector in a matrix".
library(zoo)
any(apply(new, 2, rollapply, length(m1), identical, c(m1)))
#[1] TRUE
any(apply(new, 1, rollapply, length(m2), identical, c(m2)))
#[1] FALSE
A vectorised solution is to use which with the argument arr.ind = TRUE and compare the row indices of the zeros, i.e.
i_new <- which(new == 0, arr.ind = TRUE)[,1]
i_m1 <- which(m1 == 0, arr.ind = TRUE)[,1]
i_m2 <- which(m2 == 0, arr.ind = TRUE)[,1]
Reduce(`-`, i_new) == Reduce(`-`, i_m1)
#[1] TRUE
Reduce(`-`, i_new) == Reduce(`-`, i_m2)
#[1] FALSE
To see how and why this works you simply need to investigate the i_*, i.e.
which(m1 == 0, arr.ind = TRUE)
row col
[1,] 1 1
[2,] 2 1
which(m2 == 0, arr.ind = TRUE)
row col
[1,] 1 1
[2,] 1 2
which(new == 0, arr.ind = TRUE)
row col
[1,] 1 1
[2,] 2 1
You could wrap rle in a function that applies it to both rows and columns.
vecCheck <- function(mat, v, n) {
l <- unlist(lapply(1:2, function(x) apply(mat, x, rle)), recursive=F)
any(unlist(sapply(l, function(x) x$lengths[x$values == v] == n)))
}
new <- matrix(9, 4, 4)
vecCheck(new, 0, 2)
# [1] FALSE
new[1:2, 1 ] <- 0
vecCheck(new, 0, 2)
# [1] TRUE
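The answers above rely on m1 and m2 being a single column or row. For a general sub-matrix, one option is a plain sliding-window comparison. This is a minimal sketch under that assumption; contains_block is a hypothetical name, and the double loop is meant for small matrices rather than speed.

```r
# Is `small` present as a contiguous block somewhere inside `big`?
contains_block <- function(big, small) {   # hypothetical helper
  nr <- nrow(small)
  nc <- ncol(small)
  if (nr > nrow(big) || nc > ncol(big)) return(FALSE)
  for (i in 0:(nrow(big) - nr)) {
    for (j in 0:(ncol(big) - nc)) {
      # window of the same shape as `small`, anchored at (i + 1, j + 1)
      window <- big[i + seq_len(nr), j + seq_len(nc), drop = FALSE]
      if (all(window == small)) return(TRUE)
    }
  }
  FALSE
}

new <- matrix(9, 4, 4)
new[1:2, 1] <- 0
m1 <- matrix(0, 2, 1)
m2 <- matrix(0, 1, 2)

contains_block(new, m1)  # TRUE
contains_block(new, m2)  # FALSE
```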
I am trying to create the following matrix A for n rows and n+1 columns. n will likely be around 20 or 30, but for the purpose of the question I put it at 4 and 5.
Here is what I have so far:
N <- 5 # n+1
n <- 4 # n
columns <- list()
# first column:
columns[1] <- c(-1, 1, rep(0, N-2))
# all other columns:
for(i in N:2) {
columns[i] <- c((rep(0, N-i), 1, -2, 1, rep(0, i-3)))
}
# combine into matrix:
A <- cbind(columns)
I keep getting the following error msg:
In columns[1] <- c(-1, 1, rep(0, N - 2)) :
number of items to replace is not a multiple of replacement length
And later
"for(i in N:2) {
columns[i] <- c((rep(0, N-i),"
}
Error: unexpected '}' in "}"
I guess you can try the for loop below to create your matrix A:
N <- 5
n <- 4
A <- matrix(0,n,N)
for (i in 1:nrow(A)) {
if (i == 1) {
A[i,1:2] <- c(-1,1)
} else {
A[i,i+(-1:1)] <- c(1,-2,1)
}
}
such that
> A
[,1] [,2] [,3] [,4] [,5]
[1,] -1 1 0 0 0
[2,] 1 -2 1 0 0
[3,] 0 1 -2 1 0
[4,] 0 0 1 -2 1
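As for the errors in the question, they appear to come from three things: assigning a length-5 vector to a single list slot with single brackets (use [[ ]] instead), an extra pair of parentheses inside c(), and calling cbind on the list itself. A minimal sketch of the question's own build-and-bind idea with those points fixed is below. It builds the rows of A rather than the columns, assumes n >= 2, and the name rows is my own.

```r
N <- 5  # number of columns (n + 1)
n <- 4  # number of rows

rows <- vector("list", n)
rows[[1]] <- c(-1, 1, rep(0, N - 2))   # [[ ]] stores the whole vector in one slot
for (i in 2:n) {                       # assumes n >= 2
  # (i - 2) leading zeros, then 1 -2 1, then zeros up to length N
  rows[[i]] <- c(rep(0, i - 2), 1, -2, 1, rep(0, N - i - 1))
}

A <- do.call(rbind, rows)              # bind the list elements as rows
A
```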
Another solution is to use outer. This looks more compact than the for loop approach and may be faster, i.e.,
A <- `diag<-`(replace(z<-abs(outer(1:n,1:N,"-")),!z %in% c(0,1),0),
c(-1,rep(-2,length(diag(z))-1)))
I thought this would be fast compared to the loop, but when I tested on a 5000x5001 example, the loop in ThomasIsCoding's answer was about 5x faster. Go with that one!
N = 5
n = N - 1
A = matrix(0, nrow = n, ncol = N)
delta = row(A) - col(A)
diag(A) = -2
A[delta %in% c(1, -1)] = 1
A[1, 1] = -1
A
# [,1] [,2] [,3] [,4] [,5]
# [1,] -1 1 0 0 0
# [2,] 1 -2 1 0 0
# [3,] 0 1 -2 1 0
# [4,] 0 0 1 -2 1
You could use data.table::shift to shift the vector c(1, -2, 1, 0) by all increments from -1 (backwards shift / lead by 1) to n - 1 (forward shift / lagged by n - 1) and then cbind all the shifted outputs together. The first-row first-column element doesn't follow this pattern so that's fixed at the end.
library(data.table)
out <- do.call(cbind, shift(c(1, -2, 1, 0), seq(-1, n - 1), fill = 0))
out[1, 1] <- -1
out
# [,1] [,2] [,3] [,4] [,5]
# [1,] -1 1 0 0 0
# [2,] 1 -2 1 0 0
# [3,] 0 1 -2 1 0
# [4,] 0 0 1 -2 1
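Since n will be around 20 or 30 in practice, it may help to wrap the construction in a function of n. The sketch below fills the three diagonals by matrix indexing with two-column index matrices; make_A is a hypothetical name, not something from the answers above.

```r
# Build the n x (n + 1) matrix with -2 on the diagonal, 1 next to it,
# and -1 in the top-left corner.
make_A <- function(n) {                        # hypothetical helper
  A <- matrix(0, nrow = n, ncol = n + 1)
  idx <- seq_len(n)
  A[cbind(idx, idx)] <- -2                     # main diagonal
  A[cbind(idx, idx + 1)] <- 1                  # entries right of the diagonal
  if (n > 1) {
    A[cbind(idx[-1], idx[-1] - 1)] <- 1        # entries left of the diagonal
  }
  A[1, 1] <- -1                                # the corner breaks the pattern
  A
}

make_A(4)
```

Indexing with cbind(row, col) touches only the 3n or so non-zero cells, so no n x (n + 1) helper matrix is needed.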
I want to remove those columns from a matrix M that contain at least one negative number. For example, if
M = (1 0 0 1)
(1 -1 0 2)
(2 3 4 -3)
I want M to become
M = (1 0)
(1 0)
(2 4)
How can I write the code for M <- removeNegativeColumns(M)?
A simple way is to count the negative values in each column with colSums(M < 0) and keep only the columns where that count is zero.
# Data
M <- matrix(c(1,0,0,1,1, -1, 0, 2,2, 3, 4, -3), ncol = 4, byrow = T)
M[, !colSums(M < 0 )]
# [,1] [,2]
#[1,] 1 0
#[2,] 1 0
#[3,] 2 4
M <- matrix(c(1,0,0,1,1, -1, 0, 2,2, 3, 4, -3), ncol = 4, byrow = TRUE)
# TRUE for columns that contain no negative value
M1 <- apply(M, 2, function(i) !any(i < 0))
M <- M[, M1]
removeNegativeColumns <- function(M) M[,apply(M>=0,2,all)]
removeNegativeColumns(M)
# [,1] [,2]
# [1,] 1 0
# [2,] 1 0
# [3,] 2 4
Check whether the minimum of each column is less than zero, then use that to filter your matrix:
filter <- apply(M, 2, function (x) min(x) < 0)
M <- M[,!filter]
Edit:
As per Moody_Mudskipper this is a similar but superior (and correct) method:
filter <- apply(M, 2, function (x) any(x < 0))
M <- M[,!filter]
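One thing the snippets above share is that M[, keep] collapses to a plain vector when only one column is kept. A minimal sketch of the requested function that guards against this with drop = FALSE follows. Treating NA as "not negative" via na.rm = TRUE is my own assumption, not something the question specifies.

```r
removeNegativeColumns <- function(M) {
  # count negative entries per column; NA values are ignored (assumption)
  keep <- colSums(M < 0, na.rm = TRUE) == 0
  M[, keep, drop = FALSE]   # drop = FALSE keeps the result a matrix
}

M <- matrix(c(1, 0, 0, 1,
              1, -1, 0, 2,
              2, 3, 4, -3), ncol = 4, byrow = TRUE)
out <- removeNegativeColumns(M)
out
```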