Replace the first zero element of a row of a matrix in R

I would like to replace, as fast as possible, the first zero in some rows of a matrix with values stored in another vector.
There is a numeric matrix where each row is a vector with some zeros.
I also have two vectors: replace.in.these.rows, which lists the rows to modify, and new.values, which holds the replacement values. I can already compute the position of the first zero in each of those rows with sapply:
mat <- matrix(1,5,5)
mat[c(1,8,10,14,16,22)] <- 0
replace.in.these.rows <- c(1,2,3)
new.values <- c(91,92,93)
corresponding.poz.of.1st.zero <- sapply(replace.in.these.rows,
    function(x) which(mat[x, ] == 0)[1])
Now I would like something that walks over the two index vectors in parallel, preferably without a for loop, along these lines:
mat[replace.in.these.rows, corresponding.poz.of.1st.zero] <- new.values
Is there an indexing trick that goes beyond simple vectors? I could not use a list or an array (e.g. by column) as the index.
R matrices are stored column by column (column-major order). Do I gain anything if I store the data in transposed form? That would mean working on columns instead of rows.
Context:
This matrix stores the contact IDs of a network. It is not an n x n adjacency matrix but an n x max.number.of.partners (or n*=30) matrix.
The network uses an edgelist by default, but I wanted to store all links from X together.
I assumed, but am not sure, that this is more efficient than extracting the information from the edgelist every time (multiple times per round in a simulation).
I also assumed that this linearly growing matrix form is faster than storing the same information in a list with the same layout.
Some comments on these contextual assumptions are also welcome.

Edit: If only the first zeros are to be replaced, then this approach works:
first0s <- apply(mat[replace.in.these.rows, ], 1, function(x) which(x == 0)[1])
mat[cbind(replace.in.these.rows, first0s)] <- new.values
> mat
[,1] [,2] [,3] [,4] [,5]
[1,] 91 1 1 0 1
[2,] 1 1 1 1 92
[3,] 1 93 1 1 1
[4,] 1 1 0 1 1
[5,] 1 0 1 1 1
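The apply call above still invokes an R function once per row. A minimal sketch of a loop-free lookup follows, assuming base R's max.col with ties.method = "first" is acceptable for the job. The helper name first_zero_col is made up for illustration, and rows that contain no zero are skipped rather than passed on as NA indices.

```r
mat <- matrix(1, 5, 5)
mat[c(1, 8, 10, 14, 16, 22)] <- 0
replace.in.these.rows <- c(1, 2, 3)
new.values <- c(91, 92, 93)

# Hypothetical helper: column of the first zero in each requested row,
# or NA when the row contains no zero.
first_zero_col <- function(m, rows) {
  is.zero <- (m[rows, , drop = FALSE] == 0) + 0L   # 0/1 integer matrix
  col <- max.col(is.zero, ties.method = "first")   # first column holding the row maximum
  col[rowSums(is.zero) == 0] <- NA_integer_        # no zero in this row
  col
}

first0s <- first_zero_col(mat, replace.in.these.rows)
ok <- !is.na(first0s)                              # keep only rows that have a zero
mat[cbind(replace.in.these.rows[ok], first0s[ok])] <- new.values[ok]
```

The two-column matrix built by cbind pairs each row number with its own column, which is what the plain mat[rows, cols] form cannot express.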
Edit: I originally thought that the goal was to replace all zeros in the chosen rows, and this was my approach for that. It is completely vectorized:
idxs <- which(mat==0, arr.ind=TRUE)
# This returns the rows and columns that identify the zero elements
# idxs[,"row"] %in% replace.in.these.rows
# [1] TRUE TRUE FALSE FALSE TRUE TRUE
# That isolates the ones you want.
# idxs[ idxs[,"row"] %in% replace.in.these.rows , ]
# That shows what you will supply as the two-column argument to "["
# row col
#[1,] 1 1
#[2,] 3 2
#[3,] 1 4
#[4,] 2 5
chosen.ones <- idxs[ idxs[,"row"] %in% replace.in.these.rows , ]
mat[chosen.ones] <- new.values[chosen.ones[,"row"]]
# Replace the zeros with the values chosen (and duplicated if necessary) by "row".
mat
#---------
[,1] [,2] [,3] [,4] [,5]
[1,] 91 1 1 91 1
[2,] 1 1 1 1 92
[3,] 1 93 1 1 1
[4,] 1 1 0 1 1
[5,] 1 0 1 1 1
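The assignment above looks up new.values by row number, which works here because the chosen rows happen to be 1, 2, 3. A sketch for arbitrary row numbers, assuming new.values is ordered like replace.in.these.rows, can translate row numbers into positions with match(). The row set c(2, 4, 5) and its values are invented for the example.

```r
mat <- matrix(1, 5, 5)
mat[c(1, 8, 10, 14, 16, 22)] <- 0
replace.in.these.rows <- c(2, 4, 5)   # assumed: rows that are not 1..k
new.values <- c(92, 94, 95)           # assumed: one value per listed row

idxs <- which(mat == 0, arr.ind = TRUE)
keep <- idxs[, "row"] %in% replace.in.these.rows
chosen.ones <- idxs[keep, , drop = FALSE]   # drop = FALSE keeps a matrix even for one hit

# Position of each chosen row inside replace.in.these.rows,
# used to pick the matching replacement value.
value.pos <- match(chosen.ones[, "row"], replace.in.these.rows)
mat[chosen.ones] <- new.values[value.pos]
```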

Related

R - expand.grid with dataset having columns of different sizes

I want to produce a matrix that holds all possible combinations of the integers from 1 up to each element of a vector x.
The length of the vector x may change.
For this sample vector:
x = c(3,8,2)
I want the result to look something like this:
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 1 1 2
[3,] 1 2 1
...
[48,] 3 8 2
I understand expand.grid does the job, however, I can't seem to find the parameters which allow for different sets in each column.
Generate the sequence for each element with seq, then pass the resulting list to expand.grid:
out <- expand.grid(lapply(x, seq))
dim(out)
#[1] 48 3
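expand.grid returns a data frame in which the first column varies fastest, whereas the layout shown in the question is a matrix whose last column varies fastest. A small sketch of one way to get that ordering, assuming a plain matrix without dimnames is wanted; the name combos is only illustrative.

```r
x <- c(3, 8, 2)

# Build the grid on the reversed list so that the last element of x varies
# fastest, then put the columns back into their original order.
grid <- expand.grid(rev(lapply(x, seq_len)))
combos <- unname(as.matrix(grid[, rev(seq_along(x)), drop = FALSE]))

head(combos, 3)
```

seq_len is used instead of seq so that each element is always read as a length.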

R: how to calculate element-wise arg-min from a list of matrices?

Suppose I have a list of matrices. Suppose further I have found the smallest values by the column.
Here is my last question
I really need to know from which matrix each smallest value was selected. My original function is very complicated, so I have provided a simple example. I have one idea but do not know how to implement it correctly in R.
My idea is:
Suppose that [i,j] indexes an element of the matrices. Then,
if(d[[1]][i,j] < d[[2]][i,j]){
d[[1]][i,j] <- "x"
}else { d[[2]][i,j] <- "z"}
So, I would like to assign the name of the matrix that corresponds to each smallest value, and store the names in a separate matrix. Then I can see the values in one matrix and the names of the matrices they come from in another.
For example,
y <- c(3,2,4,5,6, 4,5,5,6,7)
x <- matrix(0, 5, 5)
x[lower.tri(x,diag=F)] <- y
> x
[,1] [,2] [,3] [,4] [,5]
[1,] 0 0 0 0 0
[2,] 3 0 0 0 0
[3,] 2 6 0 0 0
[4,] 4 4 5 0 0
[5,] 5 5 6 7 0
k <- c(1,4,5,2,5,-4,4,4,4,5)
z <- matrix(0, 5, 5)
z[lower.tri(z,diag=F)] <- k
> z
[,1] [,2] [,3] [,4] [,5]
[1,] 0 0 0 0 0
[2,] 1 0 0 0 0
[3,] 4 5 0 0 0
[4,] 5 -4 4 0 0
[5,] 2 4 4 5 0
d <- list(x, z)
Then:
do.call(pmin, d) (answered by @akrun)
This only gives me the matrix of the smallest values. I would like to know where each value comes from.
Any ideas or help, please?
You can use Map and do.call to create your own functions that will be applied element-wise to a list of inputs,
in your case a list of matrices.
pwhich.min <- function(...) {
which.min(c(...)) # which.min takes a single vector as input
}
di <- unlist(do.call(Map, c(list(f = pwhich.min), d)))
dim(di) <- dim(x) # take dimension from one of the inputs
di
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 1 1
[2,] 2 1 1 1 1
[3,] 1 2 1 1 1
[4,] 1 2 2 1 1
[5,] 2 2 2 2 1
EDIT:
To elaborate, you could do something like Map(f = min, z, x) to apply min to each pair of values in z and x, although in that case min already supports an arbitrary number of inputs through an ellipsis (...).
By contrast, which.min only takes a single vector as input, so you need a wrapper with an ellipsis that combines all values into a vector (pwhich.min above).
Since you may want to have more than two matrices, you can put them all in a list and use do.call to pass each element of the list as an argument to the function you specify in f.
Or another option would be to convert it to a 3D array and use apply with which.min
apply(array(unlist(d), c(5, 5, 2)), c(1, 2), which.min)
Or with pmap from purrr
library(purrr)
pmap_int(d, ~ which.min(c(...))) %>%
array(., dim(x))
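The question also asks for a matrix of names rather than list positions. A minimal sketch of that last step, assuming the list is given names (the original d is unnamed, so the names "x" and "z" here are an assumption) and that ties should go to the first matrix in the list, which is how which.min behaves:

```r
x <- matrix(0, 5, 5)
x[lower.tri(x)] <- c(3, 2, 4, 5, 6, 4, 5, 5, 6, 7)
z <- matrix(0, 5, 5)
z[lower.tri(z)] <- c(1, 4, 5, 2, 5, -4, 4, 4, 4, 5)

d <- list(x = x, z = z)   # named list: an assumption for this sketch

# Stack the matrices into a 5 x 5 x 2 array and take the arg-min
# along the third dimension.
stacked <- array(unlist(d, use.names = FALSE), c(dim(d[[1]]), length(d)))
idx <- apply(stacked, c(1, 2), which.min)

# Translate list positions into the names of the source matrices.
src <- matrix(names(d)[as.vector(idx)], nrow = nrow(idx), ncol = ncol(idx))

# Element-wise minima, for viewing next to src.
mins <- do.call(pmin, unname(d))
```

Cells where both matrices hold the same value (the zeros on and above the diagonal in this example) are attributed to the first list element.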

R: Remove rows with fewer than certain threshold non-zero values

I would like to know how to remove rows from a data frame that have fewer than (let's say 5) non-zero entries.
The closest I've come is:
length(which(df[1,] > 0)) >= 5
but how to apply this to the whole data frame and drop the ones that are FALSE? Is there a function similar to the COUNTIF() function in excel that I can apply here?
Thank you for your help.
You can use boolean values in rowSums and in [:
df[ rowSums(df > 0) >= 5, ]
There are 3 steps hidden in this expression:
The expression df > 0 produces a logical matrix that is TRUE wherever an element is > 0.
The function rowSums then counts those elements in every row (when summing, TRUE is treated as 1 and FALSE as 0). This equals the number of non-zero elements as long as the data contain no negative values.
Finally, [ selects only the rows where that count is >= 5.
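The three steps can be written out one at a time. The sketch below uses a made-up data frame and two assumptions that go slightly beyond the answer: it tests df != 0 so that negative values also count as non-zero, and it passes na.rm = TRUE so that missing values are ignored instead of turning the count into NA.

```r
# Made-up example data: row 3 contains a negative value and an NA.
df <- data.frame(
  a = c(1, 0, -2),
  b = c(1, 0, 3),
  c = c(1, 4, 5),
  d = c(1, 0, NA),
  e = c(1, 0, 6),
  f = c(0, 0, 7)
)

is.nonzero <- df != 0                               # step 1: logical matrix
nonzero.count <- rowSums(is.nonzero, na.rm = TRUE)  # step 2: count TRUE per row
kept <- df[nonzero.count >= 5, , drop = FALSE]      # step 3: keep rows meeting the threshold

# For comparison: counting only strictly positive entries.
positive.count <- rowSums(df > 0, na.rm = TRUE)
```

With these data the two counts differ for row 3, so the choice between > 0 and != 0 decides whether that row survives.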
You can also use a for-loop.
We first create a matrix of zeros and ones to test our code. Row 2 has to be excluded because it has fewer than 5 non-zero values.
In the loop we count the number of non-zero values per row and record TRUE if this count is less than 5 (FALSE otherwise). The vector named 'drop' therefore holds one TRUE or FALSE per row. In the final step, we exclude the rows for which drop is TRUE.
mat <- matrix(c(1,1,1,1,0,1,1,1,1,1,1,1,1,1,1), nrow=3, ncol=5)
mat
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 1 1
[2,] 1 0 1 1 1
[3,] 1 1 1 1 1
drop <- NULL
for(i in seq_len(nrow(mat))){
  count.non.zero <- sum(mat[i,] != 0, na.rm=TRUE)
  drop <- c(drop, count.non.zero < 5)
}
mat[!drop,]
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 1 1
[2,] 1 1 1 1 1
NOTE: na.rm=TRUE allows this script to work when your data contains missing values.
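The loop above grows its result vector by one element per iteration. If a loop is preferred, a variant that allocates the logical vector once might look like the sketch below. The name drop.row is chosen here only to avoid reusing the name of the base function drop, and drop = FALSE in the final subset keeps the result a matrix even when a single row remains.

```r
mat <- matrix(c(1,1,1,1,0,1,1,1,1,1,1,1,1,1,1), nrow = 3, ncol = 5)

drop.row <- logical(nrow(mat))        # preallocated, all FALSE
for (i in seq_len(nrow(mat))) {
  # TRUE when row i has fewer than 5 non-zero values
  drop.row[i] <- sum(mat[i, ] != 0, na.rm = TRUE) < 5
}
kept <- mat[!drop.row, , drop = FALSE]
```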

Delete specific values in a matrix according to two position vectors

My aim is to delete specific positions in a matrix according to a vector. Here is a small example.
Users_pos <- c(1,2)
Items_pos <- c(3,2)
Given a Matrix A:
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
[3,] 7 8 9
My aim, according to the two vectors Users_pos and Items_pos, is to delete the following values:
A[1,2] and A[3,2]
I'm wondering if there's a possibility to do so without typing in the values for rows and columns by hand.
You can index k elements in a matrix A using A[X], where X is a k-row, 2-column matrix where each row is the (row, col) value of the indicated element. Therefore, you can index your two elements in A with the following indexing matrix:
rbind(Users_pos, Items_pos)
# [,1] [,2]
# Users_pos 1 2
# Items_pos 3 2
Using this indexing, you could choose to extract the information currently stored with A[X] or replace those elements with A[X] <- new.values. If you, for instance, wanted to replace these elements with NA, you could do:
A[rbind(Users_pos, Items_pos)] <- NA
A
# [,1] [,2] [,3]
# [1,] 1 NA 3
# [2,] 4 5 6
# [3,] 7 NA 9
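Whether rbind or cbind builds the right index matrix depends on what the two vectors mean. The sketch below contrasts the two readings on the same data: with rbind each vector is one (row, column) pair, as in the answer above. With cbind the first vector lists row numbers and the second lists the matching column numbers. The second reading is an assumption added here for comparison, not something the question states.

```r
A <- matrix(1:9, nrow = 3, byrow = TRUE)
Users_pos <- c(1, 2)
Items_pos <- c(3, 2)

# Reading 1: each vector is a (row, col) pair -> elements (1,2) and (3,2).
by.pair <- rbind(Users_pos, Items_pos)

# Reading 2 (assumed alternative): rows in Users_pos, columns in Items_pos
# -> elements (1,3) and (2,2).
by.parallel <- cbind(Users_pos, Items_pos)

A[by.pair]       # values at (1,2) and (3,2)
A[by.parallel]   # values at (1,3) and (2,2)

B <- A
B[by.pair] <- NA # a matrix cannot lose single cells, so they are blanked with NA
```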

Efficient creation of tridiagonal matrices

How can I create a square band matrix, given the main diagonal and the first diagonals below and above it? I am looking for a function like
tridiag(upper, lower, main)
where length(upper) == length(lower) == length(main) - 1, and which returns, for example,
tridiag(1:3, 2:4, 3:6)
[,1] [,2] [,3] [,4]
[1,] 3 1 0 0
[2,] 2 4 2 0
[3,] 0 3 5 3
[4,] 0 0 4 6
Is there an efficient way to do it?
This function will do what you want:
tridiag <- function(upper, lower, main){
    out <- matrix(0,length(main),length(main))
    diag(out) <- main
    # seq_along() yields an empty index when upper is empty, unlike seq.int(0), which gives c(1, 0)
    indx <- seq_along(upper)
    out[cbind(indx+1,indx)] <- lower
    out[cbind(indx,indx+1)] <- upper
    return(out)
}
Note that when the index to a matrix is a 2-column matrix, each row of that index is interpreted as the (row, column) position of a single element, which receives the corresponding value from the vector being assigned.
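As a quick check of the placement rule, here is a sketch of the same construction with the length requirement from the question enforced up front. The name tridiag_checked and the stopifnot guard are additions for illustration only.

```r
# Hypothetical variant of tridiag() that validates its arguments.
tridiag_checked <- function(upper, lower, main) {
  n <- length(main)
  stopifnot(length(upper) == n - 1, length(lower) == n - 1)
  out <- matrix(0, n, n)
  diag(out) <- main
  i <- seq_len(n - 1)
  out[cbind(i + 1, i)] <- lower   # sub-diagonal: element (i + 1, i)
  out[cbind(i, i + 1)] <- upper   # super-diagonal: element (i, i + 1)
  out
}

m <- tridiag_checked(upper = 1:3, lower = 2:4, main = 3:6)
m
```

Each cbind call builds one two-column index matrix, so both off-diagonals are filled in a single vectorized assignment instead of a loop over rows.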
