Comparing rows of an R matrix with a predefined vector - r

I have made a matrix with values 1 and 0, and I want to check if there is one or more rows identical to (0, 0, 0, 0, 0, 0, 0, 0, 0, 0).
How can I do this?
Here's my code so far for making the matrix:
moeda <- c(0, 1)
n <- 100
casosTotais <- 0
casosFav <- 0
caras <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0) ## the vector to compare with
matriz <- matrix(nrow = n, ncol = 10)
i <- 1
lin <- 1
col <- 1
while(i <= n * 10){
matriz[lin, col] <- sample(moeda,1)
if(col==10){
lin <- lin + 1
col <- col - 10
}
i <- i + 1
col <- col + 1
}
matriz

I will first assume a general caras with zeros and ones:
## a vector of TRUE/FALSE; TRUE means a row of `matriz` is identical to `caras`
comp <- colSums(abs(t(matriz) - caras)) == 0
Then, if caras is simply a vector of zeros:
## a vector of TRUE/FALSE; TRUE means a row of `matriz` only contains zeros
comp <- rowSums(matriz) == 0
If you want to summarize the comparison:
To know which rows of matriz are identical to caras, do which(comp).
To know if any row of matriz is identical to caras, do any(comp).
To know how many rows of matriz are identical to caras, do sum(comp).
Note: You can generate this random matrix using:
## an n x 10 random matrix of zeros and ones
matriz <- matrix(rbinom(n * 10, size = 1, prob = 0.5), ncol = 10)
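As a quick check of the comparison above, here is a minimal sketch on a small hand-made matrix. The names m_small, target and comp2 are made up for illustration and are not part of the original code; the apply() variant is an alternative shown only for cross-checking.

```r
# Hypothetical toy data: a 4 x 3 matrix of zeros and ones
m_small <- matrix(c(0, 0, 0,
                    1, 0, 1,
                    0, 0, 0,
                    0, 1, 0), ncol = 3, byrow = TRUE)
target <- c(0, 0, 0)

# t(m_small) has one column per original row, so `target` is recycled
# down each column and subtracted element by element
comp <- colSums(abs(t(m_small) - target)) == 0

which(comp)  # rows identical to `target`
# [1] 1 3
any(comp)    # is there at least one such row?
# [1] TRUE
sum(comp)    # how many such rows?
# [1] 2

# A more explicit (but slower) row-by-row check that should agree
comp2 <- apply(m_small, 1, function(row) all(row == target))
identical(comp, comp2)
# [1] TRUE
```

The colSums version avoids an explicit loop over rows, which tends to matter only once the matrix gets large.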

Related

How to convert binary output to values in relation to a column in r

The sample data is as follows
ID <- c(1, 2, 3)
O1D1 <- c(0, 0, 0)
O1D2 <- c(0, 0, 0)
O1D3 <- c(0, 10, 0)
O2D1 <- c(0, 0, 0)
O2D2 <- c(0, 0, 0)
O2D3 <- c(18, 0, 17)
O3D1 <- c(0, 9, 0)
O3D2 <- c(20, 1, 22)
O3D3 <- c(0, 0, 0)
x <- data.frame(ID, O1D1, O1D2, O1D3, O2D1, O2D2, O2D3, O3D1, O3D2, O3D3)
I created a new column with some conditional logic.
Say, the new column is n
x$n <- (x$O1D3 > 0 & x$O2D3 == 0)
> x$n
[1] FALSE TRUE FALSE
What I am looking to get instead is a column with values such as
> x$n
[1] 0 10 0
Or, in other words, the values of O1D3 should replace TRUE values in the n column and the FALSE values can be replaced with 0.
Thanks for your time and help.
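One possible way to get that result, sketched here under the assumption that the condition is exactly the one shown in the question, is to use ifelse() so that the value of O1D3 is kept where the condition holds and 0 is used elsewhere. Only the columns needed for the condition are rebuilt below; cond and n2 are illustrative names.

```r
# Reduced version of the sample data (only the columns used by the condition)
x <- data.frame(ID   = c(1, 2, 3),
                O1D3 = c(0, 10, 0),
                O2D3 = c(18, 0, 17))

cond <- x$O1D3 > 0 & x$O2D3 == 0

# Keep O1D3 where the condition is TRUE, otherwise use 0
x$n <- ifelse(cond, x$O1D3, 0)
x$n
# [1]  0 10  0

# Same values by multiplying with the logical vector (TRUE = 1, FALSE = 0)
n2 <- x$O1D3 * cond
```

The multiplication form is shorter, but ifelse() states the intent more directly and also works when the replacement for FALSE is not 0.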

replacing specific positional value in each matrix within a list, with sequential values from a vector in r

I am attempting to replace a specific value in my list of matrices with each sequential value in a vector called one.to.two.s. This vector comprises a sequence of numbers running from 0.4 to 0.89 with steps of 0.01. From the code below, I would like to replace the value 2 in all matrices in the list by each consecutive value of one.to.two.s: the value 2 in the first matrix is replaced by the first value of one.to.two.s, the value 2 in the second matrix is replaced by the second value of one.to.two.s and so forth.
As an extension, I would like to be able to repeat the one.to.two.s sequence if the vector had, say, length 50 and the list had, say, length 100. Below, I have a for loop which doesn't work, but I believe this could be handled with lapply somehow.
A <- lapply(1:50, function(x) # construct list of matrices
matrix(c(0, 0, 0, 0,
2, 0, 0, 0,
0, 0, 0, 0,
0, 0, 0, 1), nrow = 4,ncol=4, byrow = TRUE))
Anew <-A
one.to.two.s <- c(seq(from = 0.40, to = 0.89,by=0.01))
for(t in 1:length(Anew)) {
Anew[[t]][2,1] <- one.to.two.s
}
Using an example one.to.two.s which is shorter than length(A), you could use rep with length.out to make it the correct length, and then Map over that vector and A to create Anew:
one.to.two.s <- seq(from = 0.4, to = 0.8, by = 0.01)
Anew <- Map(function(A, x) {
A[2, 1] <- x
A
}, A, rep(one.to.two.s, length.out = length(A)))
Created on 2022-01-27 by the reprex package (v2.0.1)
You can try the following for loop if the list is longer than the vector:
for(t in 1:length(Anew)) {
Anew[[t]][2,1] <- one.to.two.s[(t-1)%%length(one.to.two.s)+1]
}
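To see why the modulo index recycles the vector, here is a small sketch with made-up sizes (vals stands in for one.to.two.s and n_list for length(Anew); both names are hypothetical):

```r
# Hypothetical small sizes so the recycling is easy to see
vals   <- c(0.4, 0.5, 0.6)  # stands in for one.to.two.s
n_list <- 7                 # stands in for length(Anew)

# (t - 1) %% n + 1 maps t = 1, 2, 3, ... onto 1, ..., n, 1, ..., n, ...
idx <- (seq_len(n_list) - 1) %% length(vals) + 1
idx
# [1] 1 2 3 1 2 3 1

# rep() with length.out produces the same recycled values
identical(vals[idx], rep(vals, length.out = n_list))
# [1] TRUE
```

Subtracting 1 before the modulo and adding 1 afterwards is what keeps the index in the range 1 to n; a plain t %% n would produce 0 at every multiple of n.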
I forgot to add [t] to the end of my replacement. The vector can also be repeated ahead of time so that it matches the length of the list.
for(t in 1:length(Anew)) {
Anew[[t]][2,1] <- one.to.two.s
}
instead becomes
for(t in 1:length(Anew)) {
Anew[[t]][2,1] <- one.to.two.s[t]
}
I believe this is what you are looking for. In this example, the list consists of 105 matrices.
# use replicate() instead of lapply()
B <- 50L
A <- replicate(B*2.1,
matrix(c(0, 0, 0, 0,
2, 0, 0, 0,
0, 0, 0, 0,
0, 0, 0, 1), nrow = 4,ncol=4, byrow = TRUE),
simplify = FALSE)
Anew <- A
one.to.two.s <- seq(from = 0.40, to = 0.89, by = 0.01)
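# loop over all elements in Anew
for (t in seq_along(Anew)) {
  # the modulo yields the indices 1..50, 0, 1..50, 0, 1..5; indexing with 0
  # selects nothing, so those entries drop out and the values recycle
  Anew[[t]][2,1] <- one.to.two.s[
    seq_len(length(Anew) + 2L) %% (length(one.to.two.s) + 1L)
  ][t]
}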
# > head(sapply(Anew, '[', 2))
# [1] 0.40 0.41 0.42 0.43 0.44 0.45
# > tail(sapply(Anew, '[', 2))
# [1] 0.89 0.40 0.41 0.42 0.43 0.44

Assign value matrix based on index condition

How can I assign a value in a matrix based on a vector of indices? A working example is:
# Input:
r <- c(2, 1, 3)
m <- matrix(rep(0, 9), nrow = 3)
# Desired output
result <- matrix(c(0, 1, 0,
1, 0, 0,
0, 0, 1), nrow = 3)
result
# I tried this notation, but it does not work:
sapply(1:3, function(x)m[x, r[x]] <- 1)
We can use row/column indexing with a two-column matrix to assign:
m[cbind(seq_len(nrow(m)), r)] <- 1
Or use replace, which returns a modified copy and leaves m itself unchanged:
replace(m, cbind(seq_len(nrow(m)), r), 1)
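A minimal, self-contained sketch of both variants is shown below. The names idx, m0 and m1 are introduced only for this illustration.

```r
r <- c(2, 1, 3)
m <- matrix(0, nrow = 3, ncol = 3)

# Each row of `idx` is one (row, column) pair to be set
idx <- cbind(seq_len(nrow(m)), r)
m[idx] <- 1
m
#      [,1] [,2] [,3]
# [1,]    0    1    0
# [2,]    1    0    0
# [3,]    0    0    1

# replace() leaves its input untouched and returns a modified copy
m0 <- matrix(0, nrow = 3, ncol = 3)
m1 <- replace(m0, idx, 1)
```

The sapply attempt fails because the assignment happens inside the anonymous function, on a local copy of m, so the matrix in the calling environment is never changed.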

Vectorization of conditional distributions for pairs columns in a matrix

Given two vectors of integers:
X <- c(0, 201, 0, 0, 160, 0, 0, 0, 15, 80)
Y <- c(0, 0, 0, 0, 1, 4, 42, 10, 19, 0)
I want to calculate the probability p1 = P(X10 > X11), where X10 is a variable with a conditional distribution of X given that Y = 0, and X11 is a variable with a conditional distribution of X given that Y > 0. (This problem is motivated by a desire to implement equation 8 from RS Pimentel et al. 2015, Stat Prob Lett 96:61-67.)
For a single pair of vectors, I can simply calculate:
N <- length(X)
X10 <- X
X10[Y > 0] <- 0
X11 <- X
X11[Y == 0] <- 0
p1 <- sum(X10 > X11) / N
However, I now want to calculate p1 for all pairs of columns in an integer matrix:
Z <- c(0, 0, 0, 0, 0, 1, 0, 1, 8, 0)
matrix(c(X, Y, Z), ncol = 3)
I am not interested in the diagonal.
The desired output is therefore:
     [,1] [,2] [,3]
[1,]       0.2  0.3
[2,]            0.2
[3,]
How can I write a function that will calculate p1 for all pairs of columns in the matrix?
You can create a custom function to compute your probability, then apply it to each combination of columns:
p1 <- function(x, y) {
x10 <- x
x10[y > 0] <- 0
x11 <- x
x11[y == 0] <- 0
mean(x10 > x11)
}
M <- matrix(c(X, Y, Z), ncol = 3) # the matrix built from X, Y and Z in the question
combinations <- t(combn(ncol(M), 2))
# create a matrix of NAs, fill the appropriate values
result <- matrix(NA, nrow = ncol(M), ncol = ncol(M))
result[combinations] <- apply(combinations, 1, function(r) p1(M[, r[1]], M[, r[2]]))
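Put together as one runnable sketch, using the X, Y and Z vectors from the question, the approach looks roughly like this. The function body uses replace() instead of in-place assignment, and idx_pairs is an illustrative name; neither comes from the original answer.

```r
X <- c(0, 201, 0, 0, 160, 0, 0, 0, 15, 80)
Y <- c(0, 0, 0, 0, 1, 4, 42, 10, 19, 0)
Z <- c(0, 0, 0, 0, 0, 1, 0, 1, 8, 0)
M <- matrix(c(X, Y, Z), ncol = 3)

p1 <- function(x, y) {
  x10 <- replace(x, y > 0, 0)   # x kept only where y == 0
  x11 <- replace(x, y == 0, 0)  # x kept only where y > 0
  mean(x10 > x11)
}

# One row per unordered pair of columns: (1,2), (1,3), (2,3)
idx_pairs <- t(combn(ncol(M), 2))

result <- matrix(NA_real_, nrow = ncol(M), ncol = ncol(M))
result[idx_pairs] <- apply(idx_pairs, 1,
                           function(ij) p1(M[, ij[1]], M[, ij[2]]))
result
#      [,1] [,2] [,3]
# [1,]   NA  0.2  0.3
# [2,]   NA   NA  0.2
# [3,]   NA   NA   NA
```

Only the upper triangle is filled because combn() returns each pair once with the smaller column index first. Since p1(x, y) is not symmetric in its arguments, the lower triangle would need a separate pass with the arguments swapped if it were wanted.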

how to find the longest same number in R

For example, I have data like this:
x<-c(0,0,1,1,1,1,0,0,1,1,0,1,1,1)
I want to find the longest run of "1" and report its start and end positions; in this case the answer should be (3, 6).
How can I do this in R?
Thanks, all.
Here's an approach that uses seqle from the "cgwtools" package:
library(cgwtools)
y <- seqle(which(x == 1))
z <- which.max(y$lengths)
y$values[z] + (sequence(y$lengths[z]) - 1)
# [1] 3 4 5 6
You can use range if you just wanted the "3" and "6".
seqle "extends rle to find and encode linear sequences".
Here's the answer as a function:
longSeq <- function(invec, range = TRUE) {
require(cgwtools)
y <- seqle(which(invec == 1))
z <- which.max(y$lengths)
out <- y$values[z] + (sequence(y$lengths[z]) - 1)
if (isTRUE(range)) range(out) else out
}
Usage would be:
x <- c(0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1)
longSeq(x)
# [1] 3 6
longSeq(x, range = FALSE)
# [1] 3 4 5 6
And, with KFB's example input:
y <- c(0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1)
longSeq(y)
# [1] 9 11
You can also do this easily in base R, using a combination of rle and inverse.rle.
Creating the function
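longSeq2 <- function(x, range = TRUE){
  temp <- rle(x == 1)
  # flag only the runs of 1s whose length equals the longest run of 1s
  # (a run of 0s that happens to have the same length must not be flagged)
  temp$values <- temp$values & temp$lengths == max(temp$lengths[temp$values])
  temp <- which(inverse.rle(temp))
  if (isTRUE(range)) range(temp) else temp
}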
Testing
x <- c(0,0,1,1,1,1,0,0,0,0,0,0,0,1,1,0,1,1,1)
longSeq2(x)
## [1] 3 6
longSeq2(x, range = FALSE)
## [1] 3 4 5 6
y <- c(0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1)
longSeq2(y)
## [1] 9 11
longSeq2(y, range = FALSE)
## [1] 9 10 11
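For comparison, here is a hedged sketch that reads the start and end positions directly from the rle output, without inverse.rle. The variable names are illustrative, and if several runs of 1s share the maximum length, this version reports only the first one.

```r
x <- c(0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1)

runs   <- rle(x)
ends   <- cumsum(runs$lengths)      # last position of each run
starts <- ends - runs$lengths + 1   # first position of each run

# Keep only the runs of 1s, then take the longest (the first one if tied)
ones <- which(runs$values == 1)
best <- ones[which.max(runs$lengths[ones])]

longest <- c(start = starts[best], end = ends[best])
longest
# start   end 
#     3     6 
```

Because cumsum of the run lengths gives the last index of every run, the start of each run follows by subtracting its length and adding 1.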
