I have an adjacency matrix as below:
> matrix(c(0,1,0,0,1,0,1,0,0,1,0,1,0,0,1,0),ncol=4,byrow=T)
     [,1] [,2] [,3] [,4]
[1,]    0    1    0    0
[2,]    1    0    1    0
[3,]    0    1    0    1
[4,]    0    0    1    0
Question 1: how can I get the positions of the '1's in the matrix, like:
2 5 7 10 12 15 from R?
Question 2: how can I get the location information of '1's in each row like:
2
1 3
2 4
3
or 2 1 3 2 4 3 from R?
Thanks!
Just use which on a logical matrix
which(m1 == 1)
#[1] 2 5 7 10 12 15
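If we need the column indices of each row in a list
# split by row(m1) so each list element holds one row's column indices
# (split by col(m1) gives the same result only because this matrix is symmetric)
sapply(split(!!m1, row(m1)), which)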
Or as a vector
na.omit(dplyr::na_if(c(t(m1 * col(m1))), 0))  # na_if() comes from dplyr
#[1] 2 1 3 2 4 3
data
m1 <- matrix(c(0,1,0,0,1,0,1,0,0,1,0,1,0,0,1,0),ncol = 4,byrow = TRUE)
m <- matrix(c(0,1,0,0,1,0,1,0,0,1,0,1,0,0,1,0), ncol = 4, byrow = TRUE)
mm <- m == 1
which(mm)
#[1] 2 5 7 10 12 15
apply(mm, 1, which)
#[[1]]
#[1] 2
#
#[[2]]
#[1] 1 3
#
#[[3]]
#[1] 2 4
#
#[[4]]
#[1] 3
perhaps also see raster::adjacency
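As a further sketch of the same idea, which() with arr.ind = TRUE returns the row and column of every 1, and those can be grouped by row. The names pos, by_row and flat are only illustrative. Note that a row containing no 1 at all would simply be missing from by_row.

```r
# Adjacency matrix from the question
m <- matrix(c(0, 1, 0, 0,
              1, 0, 1, 0,
              0, 1, 0, 1,
              0, 0, 1, 0), ncol = 4, byrow = TRUE)

# One line per cell equal to 1, with its row and column position
pos <- which(m == 1, arr.ind = TRUE)

# Group the column positions by row; sort() keeps each group ascending
by_row <- lapply(split(pos[, "col"], pos[, "row"]), sort)

# Flatten to a plain vector, row by row
flat <- unlist(by_row, use.names = FALSE)
flat
```

Unlike apply(mm, 1, which), this always gives a list, even when every row happens to contain the same number of 1s.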
Related
I have a random 10x10 DF (in reality it's a few million rows):
df <- replicate(10, sample(0:5, 10, replace = TRUE))
I need to add a column at the end of my df that holds, for each row, the maximum number of consecutive values that are equal to or above a set number, e.g. 3 or more.
Therefore, a single row that contained the values 2,4,3,3,4,5,1,0,5,1 would return a value of 5, as the values 4,3,3,4,5 are all 3 or more and are consecutive.
A 5 does occur again later in the row, but that run has length 1, which is shorter than the earlier run of 5 consecutive values.
Any help appreciated.
# condition is that x should be larger or equal to 3
condition <- function(x) x >= 3
# example row
row = c(2,4,3,3,4,5,1,0,5,1)
# we can use condition on row:
condition(row)
# and we can employ rle on that:
rle(condition(row))
# we need to filter those rle results for TRUE:
r <- rle(condition(row))
r$lengths[r$values]
# The answer is the max of the latter
max(r$lengths[r$values])
Or, for your data frame example:
# condition is that x should be larger or equal to 3
condition <- \(x) x >= 3
number <- function(row, condition) {
  r <- row |>
    condition() |>
    rle()
  max(r$lengths[r$values])
}
df <- replicate(10, sample(0:5, 10, replace = TRUE))
apply(df, 1, number, condition)
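One thing to watch for: if a row has no value meeting the condition, max() is called on an empty vector and returns -Inf with a warning. The following is a minimal sketch of a guarded variant that also appends the result as a column, as the question asks. The helper name longest_run and the column name longest are made up for illustration.

```r
# Hypothetical helper: longest run of values >= lim in a numeric vector
longest_run <- function(x, lim = 3) {
  r <- rle(x >= lim)
  runs <- r$lengths[which(r$values)]  # lengths of the TRUE runs only
  if (length(runs) > 0) max(runs) else 0L
}

# Example row from the question
longest_run(c(2, 4, 3, 3, 4, 5, 1, 0, 5, 1))

set.seed(42)
df <- replicate(10, sample(0:5, 10, replace = TRUE))

# Append the per-row result as an extra column
out <- cbind(df, longest = apply(df, 1, longest_run))
out
```

Using which(r$values) rather than r$values as the index also drops any NA runs instead of propagating them.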
Use rle here for run-length encoding.
vec <- c(2,4,3,3,4,5,1,0,5,1)
r <- rle(vec >= 3)
r
# Run Length Encoding
# lengths: int [1:5] 1 5 2 1 1
# values : logi [1:5] FALSE TRUE FALSE TRUE FALSE
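tr <- which(r$values)                # indices of the TRUE runs
ind <- tr[which.max(r$lengths[tr])]  # the longest TRUE run, not merely the first one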
ind
# [1] 2
r$lengths[ind]
# [1] 5
### to see what those five values are ...
r$values[-ind] <- FALSE
vec[inverse.rle(r)]
# [1] 4 3 3 4 5
That gets us the longest length within the row. To apply this row-wise to a frame,
set.seed(42)
df <- replicate(10, sample(0:5, 10, replace = TRUE))
df
#       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#  [1,]    0    0    3    1    3    5    1    3    5     3
#  [2,]    4    4    4    5    4    5    4    0    1     1
#  [3,]    0    5    4    2    3    1    0    2    1     1
#  [4,]    0    3    4    5    1    3    0    2    0     2
#  [5,]    1    1    3    1    1    2    3    4    1     4
#  [6,]    3    1    1    3    2    5    4    4    4     4
#  [7,]    1    2    3    3    0    4    1    3    5     5
#  [8,]    1    0    2    5    4    1    0    5    4     2
#  [9,]    0    0    1    1    1    5    4    4    3     5
# [10,]    3    2    0    4    1    1    3    3    0     3
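func <- function(x, lim = 3) {
  r <- rle(x >= lim)
  # lengths of the runs where x >= lim; which() also drops NA runs
  runs <- r$lengths[which(r$values)]
  # take the longest such run (the first run is not necessarily the longest)
  if (length(runs) > 0) max(runs) else 0
}
apply(df, 1, func)
# [1] 3 7 2 3 2 5 3 2 5 2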
Let's say I have a vector
vec <- c(3,0,1,1,0,3,0,1,3,0,0,0,3)
And I want to be able to count through this vector using the value 3 as the refresh point. So, the output I want is
      vec out
 [1,]   3   1
 [2,]   0   2
 [3,]   1   3
 [4,]   1   4
 [5,]   0   5
 [6,]   3   1
 [7,]   0   2
 [8,]   1   3
 [9,]   3   1
[10,]   0   2
[11,]   0   3
[12,]   0   4
[13,]   3   1
How would I do this in R, preferably without using loops?
With base R, you can do:
ave(vec, cumsum(vec == 3), FUN = seq_along)
[1] 1 2 3 4 5 1 2 3 1 2 3 4 1
An option using data.table::rowid:
data.table::rowid(cumsum(vec==3L))
As another idea, we can locate the indices of the last value of 3 for each element of vec:
last3 = cummax((vec == 3) * seq_along(vec))
last3
# [1] 1 1 1 1 1 6 6 6 9 9 9 9 13
And subtract from their respective indices in vec:
seq_along(vec) - last3 + 1 ## `.. - pmax(last3, 1) ..` if `vec[1] != 3`
# [1] 1 2 3 4 5 1 2 3 1 2 3 4 1
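To get the two-column layout shown in the question, the counter can be bound next to vec. This is a small sketch of the cumsum grouping idea. vec2 is a made-up vector that does not start with 3, added to show that the leading elements then form their own group and are still counted from 1.

```r
vec <- c(3, 0, 1, 1, 0, 3, 0, 1, 3, 0, 0, 0, 3)

# A new group starts at every 3; count positions within each group
out <- ave(seq_along(vec), cumsum(vec == 3), FUN = seq_along)

# Two-column layout as in the question
cbind(vec, out)

# Hypothetical vector that does not start with 3:
# the elements before the first 3 fall into group 0
vec2 <- c(0, 1, 3, 0, 3)
out2 <- ave(seq_along(vec2), cumsum(vec2 == 3), FUN = seq_along)
out2
```

Passing seq_along(vec) rather than vec as the first argument keeps the result an integer vector regardless of the type of vec.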
Let n be a positive integer. We have a matrix B that has n columns, whose entries are integers between 1 and n. The aim is to match the rows of B with the rows of permutations(n), storing the matching row indices in a vector v.
For example, let us consider the following. If
permutations(3)=
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    1    3    2
[3,]    2    1    3
[4,]    2    3    1
[5,]    3    1    2
[6,]    3    2    1
and
B=
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    1    2    3
[3,]    3    1    2
[4,]    2    3    1
[5,]    3    1    2
Then the vector v is
1 1 5 4 5
because the first two rows of B are equal to the row number 1 of permutations(3), the third row of B is the row number 5 of permutations(3), and so on.
I tried to apply the command
row.match
but the latter returns the error:
Error in do.call("paste", c(x[, , drop = FALSE], sep = "\r")) :
second argument must be a list
One way is to use match,
# m1 is assumed to hold the permutation matrix, i.e. permutations(3)
match(do.call(paste, data.frame(B)), do.call(paste, data.frame(m1)))
#[1] 1 1 5 4 5
One possible way is to turn your matrices into dataframes and join them:
A = read.table(text = "
1 2 3
1 3 2
2 1 3
2 3 1
3 1 2
3 2 1
")
B = read.table(text = "
1 2 3
1 2 3
3 1 2
2 3 1
3 1 2
")
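library(dplyr)
# Join from B so the result keeps B's row order:
# left_join() preserves the row order of its first argument
B %>%
  left_join(mutate(A, row_id = row_number())) %>%
  pull(row_id)
# [1] 1 1 5 4 5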
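For comparison, here is a base R sketch of the same matching that works directly on matrices. The permutation matrix is typed in by hand as P so that no extra package is needed, and row_key is a made-up helper name. Rows of B that do not occur in P would come back as NA.

```r
# permutations(3), typed in by hand
P <- matrix(c(1, 2, 3,
              1, 3, 2,
              2, 1, 3,
              2, 3, 1,
              3, 1, 2,
              3, 2, 1), ncol = 3, byrow = TRUE)

B <- matrix(c(1, 2, 3,
              1, 2, 3,
              3, 1, 2,
              2, 3, 1,
              3, 1, 2), ncol = 3, byrow = TRUE)

# Hypothetical helper: collapse each row into a single string key
row_key <- function(m) apply(m, 1, paste, collapse = "\r")

# For each row of B, the index of the identical row in P
v <- match(row_key(B), row_key(P))
v
```

The explicit separator matters once n reaches 10: without one, rows such as (1, 12) and (11, 2) would collapse to the same key.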
Okay, sorry I know this sounds unnecessarily confusing. I am basically looking to return a vector of elements equal to the number of rows with each element specifying where in the matrix the specified outcome occurred.
library(gtools)
(A <- permutations(3, 3))
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    1    3    2
[3,]    2    1    3
[4,]    2    3    1
[5,]    3    1    2
[6,]    3    2    1
Using the unknown function foo would return:
foo(A, match=1)
[1] 1 1 2 3 2 3
foo(A, match=2)
[1] 2 3 1 1 3 2
Thank you for any help you can provide!
Use max.col and some indexing of the matrix:
> max.col(A==1)
[1] 1 1 2 3 2 3
> max.col(A==2)
[1] 2 3 1 1 3 2
Try
foo <- function(mat, match = 1) {
  indx <- which(mat == match, arr.ind = TRUE)
  indx[order(indx[, 1]), 2]
}
foo(A, 1)
#[1] 1 1 2 3 2 3
foo(A,2)
#[1] 2 3 1 1 3 2
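Both answers rely on the value occurring exactly once per row, which holds for a permutation matrix. As a hedged sketch for the more general case, max.col() can be asked to resolve ties deterministically, and rows that lack the value can be marked NA. The helper name first_pos is made up, and A is typed in by hand here instead of coming from gtools.

```r
# Same matrix as permutations(3, 3), typed in by hand
A <- matrix(c(1, 2, 3,
              1, 3, 2,
              2, 1, 3,
              2, 3, 1,
              3, 1, 2,
              3, 2, 1), ncol = 3, byrow = TRUE)

# Hypothetical helper: column of the first occurrence of `value` in each row
first_pos <- function(mat, value) {
  hit <- mat == value
  # the default ties.method is "random"; "first" makes the result reproducible
  pos <- max.col(hit, ties.method = "first")
  pos[rowSums(hit) == 0] <- NA  # rows that do not contain the value at all
  pos
}

first_pos(A, 1)
first_pos(A, 2)
first_pos(A, 4)  # value absent from every row
```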
I have a sparse matrix represented as
> (f <- data.frame(row=c(1,2,3,1,2,1,2,3,4,1,1,2),value=1:12))
   row value
1    1     1
2    2     2
3    3     3
4    1     4
5    2     5
6    1     6
7    2     7
8    3     8
9    4     9
10   1    10
11   1    11
12   2    12
Here the first column is always present (in fact, the first few are present, the rest are not).
I want to get the data into the matrix format:
> t(matrix(c(1,2,3,NA,4,5,NA,NA,6,7,8,9,10,NA,NA,NA,11,12,NA,NA),nrow=4,ncol=5))
     [,1] [,2] [,3] [,4]
[1,]    1    2    3   NA
[2,]    4    5   NA   NA
[3,]    6    7    8    9
[4,]   10   NA   NA   NA
[5,]   11   12   NA   NA
Here is what seems to be working:
> library(Matrix)
> as.matrix(sparseMatrix(i = cumsum(f[[1]] == 1), j=f[[1]], x=f[[2]]))
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    0
[2,]    4    5    0    0
[3,]    6    7    8    9
[4,]   10    0    0    0
[5,]   11   12    0    0
Except that I have to replace 0 with NA myself.
Is there a better solution?
You can do everything with base functions. The trick is to use indexing by a 2-col (row and col indices) matrix:
j <- f$row
i <- cumsum(j == 1)
x <- f$value
m <- matrix(NA, max(i), max(j))
m[cbind(i, j)] <- x
m
Whether this is better than using the Matrix package is subjective. In my opinion Matrix is overkill if you are not doing anything else with it. Also, with the Matrix approach, any genuine 0 in the f$value column would be turned into NA along with the empty cells when you replace the zeros, unless you are careful.
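To illustrate the point about zeros, here is a small self-contained sketch of the matrix-indexing approach on made-up data that contains a genuine 0. The data frame below is hypothetical, not the one from the question.

```r
# Same layout as in the question, but with a genuine 0 among the values
f <- data.frame(row = c(1, 2, 3, 1, 2, 1), value = c(1, 0, 3, 4, 5, 6))

j <- f$row           # column index within each output row
i <- cumsum(j == 1)  # a new output row starts whenever the index resets to 1

# Start from an all-NA matrix and fill only the cells that are present
m <- matrix(NA_real_, nrow = max(i), ncol = max(j))
m[cbind(i, j)] <- f$value
m
```

The 0 in the first row is kept as a value, and only the cells that were never assigned remain NA.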