Hi, I am looking for help understanding what indexing a probability matrix achieves when sampling. I am having a hard time wrapping my head around what prob[a, b] does. The syntax is a bit different from other languages: we pass index vectors to a matrix and it constructs a bigger one (which is kind of cool). I currently have the following Bernoulli sampling:
N <- 8
prob <- matrix(c(0.1,0.1,0.5,0.8), nrow=2)
a <- sample(1:2, size=N, replace=TRUE)
b <- rbern(N, ifelse(a == 1, 0.5, 0.1)) + 1  # the post wrote `l`, which is never defined; `a` is assumed here
rbern(N, prob = prob[a, b])
What I cannot work out is this: prob[a, b] is an 8x8 matrix, so which probabilities will rbern use if I am only asking for 8 observations?
It will simply take the first N values from the matrix used in the prob argument (starting with the first column).
Consider the following code.
N <- 8
m <- matrix(sample(0:1, N^2, replace = TRUE), N, N)
m
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#> [1,] 0 1 0 1 1 0 0 1
#> [2,] 1 1 1 0 0 0 1 1
#> [3,] 0 1 0 1 0 1 1 1
#> [4,] 1 0 0 1 0 0 0 1
#> [5,] 0 1 0 0 0 0 1 0
#> [6,] 0 1 1 1 0 1 1 0
#> [7,] 1 0 0 0 0 1 1 0
#> [8,] 0 0 1 1 1 1 0 0
rbinom(N, 1, prob = m)
#> [1] 0 1 0 1 0 0 1 0
Only the first N values of m are used for probabilities, so the result of rbinom(N, 1, prob = m) is the same as the first column of m (since all probabilities are either 0 or 1).
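If the intent was one probability per observation, i.e. prob[a[i], b[i]] for the i-th draw (an assumption about the question, not something it states), indexing with a two-column matrix picks exactly those N entries. A minimal sketch, using base R's rbinom in place of rbern and drawing b with sample purely for illustration:

```r
set.seed(42)                                  # arbitrary seed, for reproducibility only
N <- 8
prob <- matrix(c(0.1, 0.1, 0.5, 0.8), nrow = 2)
a <- sample(1:2, size = N, replace = TRUE)
b <- sample(1:2, size = N, replace = TRUE)    # illustrative stand-in for the rbern-based b

big <- prob[a, b]                             # all N x N combinations of a and b
dim(big)

# With prob = big, only the first N values (the first column) are used:
first_n <- big[1:N]                           # same values as prob[a, b[1]]

# One probability per observation: row a[i], column b[i]
p_pair <- prob[cbind(a, b)]
draws <- rbinom(N, size = 1, prob = p_pair)
```

Here p_pair is the diagonal of prob[a, b], so each draw uses its own (a, b) pair instead of every draw sharing b[1].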
I would like to transform a vector of integers such as:
vector = c(0,6,1,8,5,4,2)
length(vector) = 7
max(vector) = 8
into a matrix m of nrow = length(vector) and ncol = max(vector) :
m =
0 0 0 0 0 0 0 0
1 1 1 1 1 1 0 0
1 0 0 0 0 0 0 0
1 1 1 1 1 1 1 1
1 1 1 1 1 0 0 0
1 1 1 1 0 0 0 0
1 1 0 0 0 0 0 0
This is just an example of what I am trying to do. The function should work with any vector of integers.
I tried mapply(rep, 1, vector), but that returns a list and I did not manage to convert it into a matrix.
It would be very useful for me if someone could help.
Best Regards,
Maxime
If you apply c(rep(1, x), rep(0, max(vector) - x)) to each element x of your variable vector, you get the desired binary rows. Looping over the elements with sapply even returns a matrix directly. You only need to transpose it afterwards to get your result.
vector = c(0,6,1,8,5,4,2)
result <- t(sapply(vector, function(x) c(rep(1, x), rep(0, max(vector)-x))))
is.matrix(result)
#> [1] TRUE
result
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#> [1,] 0 0 0 0 0 0 0 0
#> [2,] 1 1 1 1 1 1 0 0
#> [3,] 1 0 0 0 0 0 0 0
#> [4,] 1 1 1 1 1 1 1 1
#> [5,] 1 1 1 1 1 0 0 0
#> [6,] 1 1 1 1 0 0 0 0
#> [7,] 1 1 0 0 0 0 0 0
Putting that into a function is easy:
binaryMatrix <- function(v) {
t(sapply(v, function(x) c(rep(1, x), rep(0, max(v)-x))))
}
binaryMatrix(vector)
# same result as before
Created on 2021-02-14 by the reprex package (v1.0.0)
Another straightforward approach would be to exploit matrix sub-assignment using row/column indices in a matrix form (see, also, ?Extract).
Define a matrix of 0s:
x = c(0, 6, 1, 8, 5, 4, 2)
m = matrix(0L, nrow = length(x), ncol = max(x))
And fill with 1s:
i = rep(seq_along(x), x) ## row indices of 1s
j = sequence(x) ## column indices of 1s
ij = cbind(i, j)
m[ij] = 1L
m
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#[1,] 0 0 0 0 0 0 0 0
#[2,] 1 1 1 1 1 1 0 0
#[3,] 1 0 0 0 0 0 0 0
#[4,] 1 1 1 1 1 1 1 1
#[5,] 1 1 1 1 1 0 0 0
#[6,] 1 1 1 1 0 0 0 0
#[7,] 1 1 0 0 0 0 0 0
Assuming that all values in the vector are non-negative integers, you can define the following function
transformVectorToMatrix <- function(v) {
nrOfCols <- max(v)
zeroRow <- integer(nrOfCols)
do.call("rbind",lapply(v,function(nrOfOnes) {
if(nrOfOnes==0) return(zeroRow)
if(nrOfOnes==nrOfCols) return(zeroRow+1)
c(integer(nrOfOnes)+1,integer(nrOfCols-nrOfOnes))
}))
}
and finally do
m = transformVectorToMatrix(vector)
to get your desired binary matrix.
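For comparison, the same matrix can also be sketched as a single vectorised comparison with outer: entry (i, j) is 1 exactly when the column index j does not exceed the i-th value. This is an alternative to the answers above rather than a replacement, and the function name binary_matrix_outer is made up for the example:

```r
binary_matrix_outer <- function(v) {
  # TRUE where v[i] >= j; unary plus converts the logical matrix to integer 0/1
  +outer(v, seq_len(max(v)), ">=")
}

x <- c(0, 6, 1, 8, 5, 4, 2)
m <- binary_matrix_outer(x)
m
```

Like the other answers, this assumes non-negative integers; each row sum of m equals the corresponding element of x.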
I have a binary vector that holds information on whether or not some event happened for some observation:
v <- c(0,1,1,0)
What I want to achieve is a matrix that holds information on all bivariate pairs of observations in this vector. That is, if two observations both have 0 or both have 1 in v, the pair should get a 1 in the matrix; if one has 0 and the other has 1, the pair should get a 0.
Hence, the goal is this matrix:
[,1] [,2] [,3] [,4]
[1,] 0 0 0 1
[2,] 0 0 1 0
[3,] 0 1 0 0
[4,] 1 0 0 0
Whether the main diagonal is 0 or 1 does not matter for me.
Is there an efficient and simple way to achieve this that does not require a combination of if statements and for loops? v might be of considerable size.
Thanks!
We can use outer
out <- outer(v, v, `==`)
diag(out) <- 0L # as you don't want to compare each element to itself
out
# [,1] [,2] [,3] [,4]
#[1,] 0 0 0 1
#[2,] 0 0 1 0
#[3,] 0 1 0 0
#[4,] 1 0 0 0
Another option is to use expand.grid to create all pairwise combinations of v with itself. Since the values are only 0 and 1, the matching pairs are exactly those whose row sum is 0 or 2 (0 + 0 and 1 + 1).
inds <- rowSums(expand.grid(v, v))
matrix(+(inds == 0 | inds == 2), nrow = length(v))
# [,1] [,2] [,3] [,4]
#[1,] 1 0 0 1
#[2,] 0 1 1 0
#[3,] 0 1 1 0
#[4,] 1 0 0 1
Since the diagonal elements are not important for you, I will keep them as they are; if you want to change them, you can use diag as shown in @markus's answer.
Another approach, slightly less efficient than outer, would be sapply:
out <- sapply(v, function(x){
x == v
})
diag(out) <- 0L
out
[,1] [,2] [,3] [,4]
[1,] 0 0 0 1
[2,] 0 0 1 0
[3,] 0 1 0 0
[4,] 1 0 0 0
microbenchmark on a vector of length 1000:
> test <- microbenchmark("LAP" = sapply(v, function(x){
+ x == v
+ }),
+ "markus" = outer(v, v, `==`), times = 1000, unit = "ms")
> test
Unit: milliseconds
expr min lq mean median uq max neval
LAP 3.973111 4.065555 5.747905 4.573002 6.324607 101.03498 1000
markus 3.515725 3.535067 4.852606 3.694924 4.908930 84.85184 1000
If you allow the main diagonal to be 1, then there will always be two unique rows v and 1 - v in this matrix no matter how large v is. Since the matrix is symmetric, it also has two such unique columns. This makes it trivial to construct this matrix.
## example `v`
set.seed(0)
v <- sample.int(2, 10, replace = TRUE) - 1L
#[1] 1 0 0 1 1 0 1 1 1 1
## column expansion from unique columns
cbind(v, 1 - v, deparse.level = 0L)[, 2 - v]
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,] 1 0 0 1 1 0 1 1 1 1
# [2,] 0 1 1 0 0 1 0 0 0 0
# [3,] 0 1 1 0 0 1 0 0 0 0
# [4,] 1 0 0 1 1 0 1 1 1 1
# [5,] 1 0 0 1 1 0 1 1 1 1
# [6,] 0 1 1 0 0 1 0 0 0 0
# [7,] 1 0 0 1 1 0 1 1 1 1
# [8,] 1 0 0 1 1 0 1 1 1 1
# [9,] 1 0 0 1 1 0 1 1 1 1
#[10,] 1 0 0 1 1 0 1 1 1 1
What is the purpose of this matrix?
If there are n0 zeros and n1 ones, the matrix has dimension (n0 + n1) x (n0 + n1) and contains n0 x n0 + n1 x n1 ones. So for a long vector v the matrix is highly redundant: it has only two distinct rows (and columns), each duplicated many times.
Obviously, if you want to store the position of 1 in this matrix, you can simply get it without forming this matrix at all.
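One way to sketch that last point: the positions of the 1s are all (row, column) pairs taken within the group of zeros plus all pairs within the group of ones. The names idx0, idx1 and ones are illustrative only, and the example reuses the small v from the question:

```r
v <- c(0, 1, 1, 0)
idx0 <- which(v == 0)
idx1 <- which(v == 1)

# Every (row, col) pair within the same group is a 1 in the full matrix
ones <- rbind(
  as.matrix(expand.grid(row = idx0, col = idx0)),
  as.matrix(expand.grid(row = idx1, col = idx1))
)
nrow(ones)   # n0^2 + n1^2 positions, diagonal included
```

Note that this list of positions itself grows quadratically, so for a very long v it may be enough to keep only idx0 and idx1.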
I am new to programming and I am trying to figure out how I can make a matrix of all zeros and insert just a single one at a random position.
I've looked for help but I can only find code to create a random matrix with zeros and ones but I only want a "one" to appear at random places in a matrix.
I've looked in here for example,
http://www.r-bloggers.com/making-matrices-with-zeros-and-ones/
set.seed(1)
mm <- matrix(0, 10, 5)
apply(mm, c(1, 2), function(x) sample(c(0, 1), 1))
# [,1] [,2] [,3] [,4] [,5]
# [1,] 0 0 1 0 1
# [2,] 0 0 0 1 1
# [3,] 1 1 1 0 1
# [4,] 1 0 0 0 1
# [5,] 0 1 0 1 1
# [6,] 1 0 0 1 1
# [7,] 1 1 0 1 0
# [8,] 1 1 0 0 0
# [9,] 1 0 1 1 1
# [10,] 0 1 0 0 1
Creating an all-zeros matrix is easy:
X <- matrix(0, 10, 10)
Now notice that a matrix in R is stored as a vector with an additional dimension attribute:
> str(X)
num [1:10, 1:10] 0 0 0 0 0 0 0 0 0 0 ...
So if you want to insert a 1 at a random position, then just pick a random position in the vector of length N*M and replace it with the value:
X[sample(10*10, 1)] <- 1
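The same idea extends to placing several ones. A small sketch, where k (the number of ones) and the 10 x 5 size are arbitrary choices for illustration; sample draws without replacement by default, so the k positions are distinct:

```r
set.seed(1)                       # arbitrary seed
X <- matrix(0, nrow = 10, ncol = 5)
k <- 3                            # hypothetical number of ones to place
pos <- sample(length(X), k)       # k distinct linear positions
X[pos] <- 1
which(X == 1, arr.ind = TRUE)     # row/column of each inserted 1
```

Using length(X) instead of a hard-coded 10*10 keeps the call correct if the matrix dimensions change.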
I am trying to create random binary square matrices. However, there are some constraints. I would like the diagonal to = 0. Also, the upper and lower triangles need to be inverse transpositions of each other.
To be clear, what I am looking for would look like the random 5 x 5 example below. If you look at any row/column pair, e.g. 3&5 or 1&4, the upper and lower triangles for those pairs have opposite results.
[,1] [,2] [,3] [,4] [,5]
[1,] 0 0 0 1 0
[2,] 1 0 0 0 0
[3,] 1 1 0 1 0
[4,] 0 1 0 0 1
[5,] 1 1 1 0 0
I am running into some problems in making my random matrices asymmetric.
Here's what I have thus far for creating a random binary 12x12 matrix:
function1 <- function(m, n) {
matrix(sample(0:1, m * n, replace = TRUE), m, n)
}
A<-function1(12,12)
A #check the matrix
diag(A)<-0
My attempt at putting the transposed upper triangle into the lower triangle:
A[lower.tri(A)] <- t(A[upper.tri(A)])
A #rechecking the matrix - doesn't seem to do it.
I have tried some variations to see if I got my upper/lower triangles mixed up, but none seem to work.
Hope this question is understandable.
fun <- function(n){
vals <- sample(0:1, n*(n-1)/2, replace = TRUE)
mat <- matrix(0, n, n)
mat[upper.tri(mat)] <- vals
mat[lower.tri(mat)] <- 1 - t(mat)[lower.tri(mat)]
mat
}
And testing it out...
> fun(5)
[,1] [,2] [,3] [,4] [,5]
[1,] 0 0 1 0 1
[2,] 1 0 1 0 1
[3,] 0 0 0 0 0
[4,] 1 1 1 0 1
[5,] 0 0 1 0 0
> out <- fun(5)
> out + t(out)
[,1] [,2] [,3] [,4] [,5]
[1,] 0 1 1 1 1
[2,] 1 0 1 1 1
[3,] 1 1 0 1 1
[4,] 1 1 1 0 1
[5,] 1 1 1 1 0
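As to why the attempt in the question does not work: A[upper.tri(A)] is a plain vector read column by column, and wrapping it in t() does not reorder it, so the values land in the wrong lower-triangle cells once the matrix is 4 x 4 or larger. Transposing the whole matrix first and then subsetting, as fun does, lines the values up. A small sketch with a made-up 4 x 4 matrix (the names wrong and right are illustrative):

```r
A <- matrix(1:16, 4, 4)

A[upper.tri(A)]        # upper triangle, read column by column
t(A)[lower.tri(A)]     # same values, ordered to match the lower-triangle positions

wrong <- A
wrong[lower.tri(wrong)] <- A[upper.tri(A)]       # order mismatch: not a mirror image
right <- A
right[lower.tri(right)] <- t(A)[lower.tri(A)]    # true mirror of the upper triangle

isSymmetric(wrong)
isSymmetric(right)
```

For a 3 x 3 matrix the two orderings happen to coincide, which can hide the problem in small tests.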
Suppose I have a vector containing data:
c <- c(1:100)
c[1:75] <- 0
c[76:100] <- 1
What I need to do is select a number of the 0's and turn them into 1's. There are potentially many ways to do this: if I'm switching 25 of the 0's, it'd be 75 choose 25, so about 5.26x10^19. So I need to do it, say, 1000 times randomly. (This is part of a larger model. I'll be using the mean of the results.)
I know (think), that I need to use sample() and a for loop - but how do I select n values randomly among the 0's, then change them to 1's?
vec <- c(rep(0, 75), rep(1, 25))
n <- 25
to_change <- sample(which(vec == 0), n)
modified_vec <- vec
modified_vec[to_change] <- 1
Something like this. You could wrap it up in a function.
And you should really do it in a matrix with apply, rather than a for loop.
This small example makes it easy to see how it works:
n_vecs <- 5
vec_length <- 10
n_0 <- 7 # Number of 0's at the start of each vector
vec_mat <- matrix(c(rep(0, n_vecs * n_0), rep(1, n_vecs * (vec_length - n_0))),
nrow = vec_length, ncol = n_vecs, byrow = TRUE)
> vec_mat
[,1] [,2] [,3] [,4] [,5]
[1,] 0 0 0 0 0
[2,] 0 0 0 0 0
[3,] 0 0 0 0 0
[4,] 0 0 0 0 0
[5,] 0 0 0 0 0
[6,] 0 0 0 0 0
[7,] 0 0 0 0 0
[8,] 1 1 1 1 1
[9,] 1 1 1 1 1
[10,] 1 1 1 1 1
change_n_0 <- function(x, n) {
x_change <- sample(which(x == 0), n)
x[x_change] <- 1
return(x)
}
vec_mat <- apply(vec_mat, MARGIN = 2, FUN = change_n_0, n = 2)
> vec_mat
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 0 0 1
[2,] 0 0 0 1 0
[3,] 0 0 0 0 0
[4,] 0 0 0 0 0
[5,] 0 0 1 0 1
[6,] 0 1 0 1 0
[7,] 1 0 1 0 0
[8,] 1 1 1 1 1
[9,] 1 1 1 1 1
[10,] 1 1 1 1 1
You can scale up the constants at the beginning as big as you'd like.
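For the 1000 repetitions and the mean mentioned in the question, replicate is another option. This is only a sketch: flip_zeros and flip_rate are made-up names, and "mean of the results" is assumed to mean the per-position average across runs.

```r
set.seed(123)                                  # arbitrary seed
vec <- c(rep(0, 75), rep(1, 25))

flip_zeros <- function(x, n) {
  zeros <- which(x == 0)
  # Index into `zeros` rather than calling sample(zeros, n): sample() treats a
  # single number k as 1:k, which would go wrong if only one 0 remained.
  x[zeros[sample.int(length(zeros), n)]] <- 1
  x
}

sims <- replicate(1000, flip_zeros(vec, 25))   # 100 x 1000 matrix, one column per run
flip_rate <- rowMeans(sims)                    # share of runs in which each position is 1
```

Each column of sims contains exactly 50 ones: the original 25 plus the 25 that were flipped in that run.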