I'm trying to figure out how to get all the diagonals of a matrix.
For example, say I have the following matrix:
A <- matrix(1:16,4)
Using the diag(A) function returns the primary diagonal:
[1] 1 6 11 16
In addition to the primary diagonal, I would like a list of all the diagonals above and below it.
5 10 15
2 7 12
9 14
3 8
4
13
I found the following link https://stackoverflow.com/a/13049722 which gives me the diagonals directly above and below the primary one, however I cannot seem to figure out how to extend the code to get the rest of them for any size matrix. I tried two nested for loops since it appears that some kind of incrementing of the matrix subscripts would produce the result I am looking for. I tried using ncol(A), nrow(A) in the for loops, but couldn't seem to figure out the right combination. Plus I am aware that for loops are generally frowned upon in R.
The code given was:
diag(A[-4,-1])
diag(A[-1,-4])
which returned the two diagonals, both upper and lower
Of course this is a square matrix and not all of the matrices I want to perform this on will be square. Filling in the non-square area with NAs would be acceptable if necessary. The answer I need may be in one of the other answers on the page, but the original question involved means, sums, etc. which added a layer of complexity beyond what I am trying to do. I have a feeling the solution to this will be ridiculously simple, but it just isn't occurring to me. I'm also surprised I was not able to find this question anywhere on SO, it would seem to be a common enough question. Maybe I don't know the proper terminology for this problem.
A <- matrix(1:16, 4)
# create an indicator for all diagonals in the matrix
d <- row(A) - col(A)
# use split to group on these values
split(A, d)
#
# $`-3`
# [1] 13
#
# $`-2`
# [1] 9 14
#
# $`-1`
# [1] 5 10 15
#
# $`0`
# [1] 1 6 11 16
#
# $`1`
# [1] 2 7 12
#
# $`2`
# [1] 3 8
#
# $`3`
# [1] 4
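The same grouping appears to carry over to non-square matrices unchanged, since row() and col() are defined for any matrix shape. A small sketch, using a 2 x 3 matrix B that is only an illustration and not part of the question:

```r
# B is an illustrative 2 x 3 matrix (an assumption, not from the question)
B <- matrix(1:6, nrow = 2)
# row(B) - col(B) is constant along each diagonal, whatever the shape
diags_B <- split(B, row(B) - col(B))
diags_B
# $`-2`
# [1] 5
#
# $`-1`
# [1] 3 6
#
# $`0`
# [1] 1 4
#
# $`1`
# [1] 2
```

No NA padding is needed here, because each diagonal is returned as its own vector of whatever length it happens to have.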
For square matrices, it is easy to convert Gavin's answer into a small function that first calculates the range of offset values. Here's such a function (note that it derives the range from ncol(inmat) only, so it assumes a square matrix):
AllDiags <- function(inmat, sorted = TRUE) {
Range <- ncol(inmat) - 1
Range <- -Range:Range
if (isTRUE(sorted)) Range <- Range[order(abs(Range))]
lapply(Range, function(x) {
inmat[row(inmat) == (col(inmat) - x)]
})
}
Here's the output on your sample matrix "A".
AllDiags(A)
# [[1]]
# [1] 1 6 11 16
#
# [[2]]
# [1] 2 7 12
#
# [[3]]
# [1] 5 10 15
#
# [[4]]
# [1] 3 8
#
# [[5]]
# [1] 9 14
#
# [[6]]
# [1] 4
#
# [[7]]
# [1] 13
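Because AllDiags() takes its offsets from ncol(inmat) alone, on a matrix with more rows than columns it would likely skip some of the lower diagonals. One possible generalisation is sketched below, under the assumption that the offsets should run from 1 - nrow to ncol - 1. The name all_diags_rect and the matrix tall are hypothetical and do not come from the answer.

```r
# Hypothetical generalisation of AllDiags() to rectangular matrices
all_diags_rect <- function(inmat) {
  # offset k = col - row: negative below the main diagonal, positive above
  offsets <- (1 - nrow(inmat)):(ncol(inmat) - 1)
  out <- lapply(offsets, function(k) inmat[col(inmat) - row(inmat) == k])
  names(out) <- offsets
  out
}

tall <- matrix(1:6, nrow = 3)  # 3 rows, 2 columns
all_diags_rect(tall)
# $`-2`
# [1] 3
#
# $`-1`
# [1] 2 6
#
# $`0`
# [1] 1 5
#
# $`1`
# [1] 4
```

Unlike the sorted = TRUE default of AllDiags(), this sketch returns the diagonals ordered from the bottom-left corner to the top-right corner.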
Here is one solution based on the observation that you can get all the diagonals by growing and then shrinking a submatrix. First take the submatrix at row N, column 1 and get its diagonal; then take rows (N-1):N and columns 1:2 and get the diagonal of that; and so on.
N <- ncol(A)
rows <- cbind(c(N:1, rep(1,N-1)), c(rep(N,N), (N-1):1)) # row indices
cols <- apply(rows, 2, rev) # col indices
diagMatSubset <- function(mat, i1, i2, j1, j2) diag(mat[i1:i2, j1:j2, drop=FALSE])
Map(diagMatSubset, list(A), rows[,1], rows[,2], cols[,1], cols[,2])
[[1]]
[1] 4
[[2]]
[1] 3 8
[[3]]
[1] 2 7 12
[[4]]
[1] 1 6 11 16
[[5]]
[1] 5 10 15
[[6]]
[1] 9 14
[[7]]
[1] 13
I am curious if there is a functional way to break up a vector into groups of contiguous values in R.
For example if I have the following
# start with this
vec <- c(1,2,3,4,6,7,8,30,31,32)
# end with this
vec_vec <- list(c(1, 2, 3, 4),c(6,7,8),c(30,31,32))
print(vec_vec[1])
I can only imagine this is possible with a for loop where I calculate differences from previous values, but maybe there are better ways to solve this.
split(vec, vec - seq_along(vec)) |> unname()
# [[1]]
# [1] 1 2 3 4
#
# [[2]]
# [1] 6 7 8
#
# [[3]]
# [1] 30 31 32
Subtracting seq_along(vec) turns each contiguous sequence into a single repeated value, which we can split by.
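To see why this works, it may help to print the grouping key itself. Within a run of consecutive integers both vec and seq_along(vec) grow by 1, so their difference stays constant; the name key is only illustrative.

```r
vec <- c(1, 2, 3, 4, 6, 7, 8, 30, 31, 32)
# the difference is constant inside a run and jumps at each gap
key <- vec - seq_along(vec)
key
# [1]  0  0  0  0  1  1  1 22 22 22
```

The key can collide for unsorted input (for example, c(1, 2, 10, 4) gives the key 0 0 7 0, which puts 4 in the same group as 1 and 2), so this sketch assumes vec is increasing.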
1) The collapse package has seqid for this:
library(collapse)
split(vec, seqid(vec))
giving:
$`1`
[1] 1 2 3 4
$`2`
[1] 6 7 8
$`3`
[1] 30 31 32
2) A base approach giving the same answer is
split(vec, cumsum(c(TRUE, diff(vec) != 1)))
3) In the comments seqle in cgwtools is suggested.
library(cgwtools)
lens <- seqle(vec)$lengths
split(vec, rep(seq_along(lens), lens))
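The base approach in 2) can be unpacked step by step. The intermediate names below (breaks, id) are only illustrative.

```r
vec <- c(1, 2, 3, 4, 6, 7, 8, 30, 31, 32)
# TRUE wherever a new run starts; the first element always starts a run
breaks <- c(TRUE, diff(vec) != 1)
# the running count of run starts gives each element its group number
id <- cumsum(breaks)
id
# [1] 1 1 1 1 2 2 2 3 3 3
split(vec, id)
```

Since this compares each element only with its immediate predecessor, it groups by position rather than by a computed key.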
Suppose we have list l which is a list of lists.
a <- list(c(1,2,3))
b <- list(c(4,5,6), c(7,8,9))
c <- list(c(10,11,12), c(13,14,15), c(16,17,18))
l <- list(a, b, c)
So l is a list of lists, where each of those lists itself contains at least one list.
Question
How can I make a function which can extract all the lowest level lists into a single list of lists?
# Goal
list(c(1,2,3), c(4,5,6), c(7,8,9), c(10,11,12), c(13,14,15), c(16,17,18))
Notes:
The example is a minimal reproducible example for which it would be possible to hard-code a solution, but it is important that the solution generalise, since the real problem has tens-of-thousands of lists each containing unknown numbers of lists - so a hard-coded solution definitely won't scale!
I hope to find a solution in base R, if possible.
A few things I've tried so far
Some unsuccessful attempts, mostly using sapply() and unlist():
sapply(l, function(x) { unlist(x) })
# [[1]]
# [1] 1 2 3
#
# [[2]]
# [1] 4 5 6 7 8 9
#
# [[3]]
# [1] 10 11 12 13 14 15 16 17 18
unlist(a)
# [1] 1 2 3
# unlist() seems to combine the elements of multiple lists into one list (not desired here)..
unlist(b)
# [1] 4 5 6 7 8 9
I thought the recursive = FALSE argument looked promising, but unfortunately I couldn't get it to do what I wanted either:
unlist(b, recursive = FALSE)
# [1] 4 5 6 7 8 9
Use unlist, non-recursive on your initial list.
unlist(l, recursive=FALSE)
# [[1]]
# [1] 1 2 3
#
# [[2]]
# [1] 4 5 6
#
# [[3]]
# [1] 7 8 9
#
# [[4]]
# [1] 10 11 12
#
# [[5]]
# [1] 13 14 15
#
# [[6]]
# [1] 16 17 18
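The attempt in the question, unlist(b, recursive = FALSE), flattened everything because the elements of b are already plain vectors; applied to l, the same call removes only the outer layer of lists. A rough equivalent, offered as a sketch, is to concatenate the sublists with c(). The names nested, flat_unlist and flat_c are illustrative, and nested is built inline so that no variable named c masks the function.

```r
# same structure as l in the question, built without a variable called c
nested <- list(
  list(c(1, 2, 3)),
  list(c(4, 5, 6), c(7, 8, 9)),
  list(c(10, 11, 12), c(13, 14, 15), c(16, 17, 18))
)
flat_unlist <- unlist(nested, recursive = FALSE)
flat_c <- do.call(c, nested)  # c() on lists concatenates their elements
length(flat_unlist)
# [1] 6
all.equal(flat_unlist, flat_c)
# [1] TRUE
```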
Just use unlist():
a <- list(c(1,2,3))
b <- list(c(4,5,6), c(7,8,9))
c <- list(c(10,11,12), c(13,14,15), c(16,17,18))
l <- list(a, b, c)
l.want <- list(c(1,2,3), c(4,5,6), c(7,8,9), c(10,11,12), c(13,14,15), c(16,17,18))
# Use unlist, removing only the outer level of nesting
l.func <- unlist(l, recursive = FALSE)
all.equal(l.want, l.func)
# [1] TRUE
I have a vector of integers like this:
a <- c(2,3,4,1,2,1,3,5,6,3,2)
values<-c(1,2,3,4,5,6)
I want to list, for every unique value in my vector (the unique values being ordered), the positions of its occurrences. My desired output:
rep_indx<-data.frame(c(4,6),c(1,5,11),c(2,7,10),c(3),c(8),c(9))
split fits pretty well here, which returns a list of indexes for each unique value in a:
indList <- split(seq_along(a), a)
indList
# $`1`
# [1] 4 6
#
# $`2`
# [1] 1 5 11
#
# $`3`
# [1] 2 7 10
#
# $`4`
# [1] 3
#
# $`5`
# [1] 8
#
# $`6`
# [1] 9
And you can access the index by passing the value as a character, i.e.:
indList[["1"]]
# [1] 4 6
You can do this, using sapply. The ordering that you need is ensured by the sort function.
sapply(sort(unique(a)), function(x) which(a %in% x))
#### [[1]]
#### [1] 4 6
####
#### [[2]]
#### [1] 1 5 11
#### ...
It will result in a list, giving the indices of your repetitions. It can't be a data.frame because a data.frame needs to have columns of same lengths.
Here sort(unique(a)) gives exactly the same values as your values vector.
NOTE: you can also use lapply to force the output to be a list. With sapply, you get a list except if by chance the number of replicates is always the same, then the output will be a matrix... so, your choice!
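The simplification mentioned in the note can be reproduced with a small vector in which every value occurs equally often. The vector a2 is a made-up input, not taken from the question.

```r
a2 <- c(1, 2, 1, 2)  # each value occurs exactly twice
s <- sapply(sort(unique(a2)), function(x) which(a2 == x))
is.matrix(s)  # sapply simplified the equal-length results to a matrix
# [1] TRUE
lst <- lapply(sort(unique(a2)), function(x) which(a2 == x))
is.list(lst)  # lapply returns a list regardless of the lengths
# [1] TRUE
```

If downstream code expects a list, lapply (or sapply with simplify = FALSE) avoids this change of type.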
Perhaps this also works
order(match(a, values))
#[1] 4 6 1 5 11 2 7 10 3 8 9
You can use the lapply function to return a list with the indexes.
lapply(values, function (x) which(a == x))
A question from a relative n00b: I'd like to split a vector into three vectors of different lengths, with the values assigned to each vector at random. For example, I'd like to split the vector of length 12 below into vectors of length 2, 3, and 7.
I can get three equal sized vectors using this:
test<-1:12
split(test,sample(1:3))
Any suggestions on how to split test into vectors of 2,3, and 7 instead of three vectors of length 4?
You could use rep to create the indices for each group and then split based on that
split(1:12, rep(1:3, c(2, 3, 7)))
If you want the items to be randomly assigned, so that it's not just the first 2 items in the first vector, the next 3 items in the second vector, and so on, you can add a call to sample:
split(1:12, sample(rep(1:3, c(2, 3, 7))))
If you don't have the specific lengths (2,3,7) in mind but just don't want it to be equal length vectors every time then SimonO101's answer is the way to go.
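A quick check, as a sketch, that shuffling the group labels in the rep/sample approach changes which items land in each group but not the group sizes. The names sizes, grp_labels and groups are illustrative.

```r
sizes <- c(2, 3, 7)
# one label per element, repeated according to the wanted sizes
grp_labels <- rep(seq_along(sizes), sizes)
# shuffling the labels randomises membership but keeps the counts
groups <- split(1:12, sample(grp_labels))
lengths(groups)
# 1 2 3 
# 2 3 7 
```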
How about using sample slightly differently...
set.seed(123)
test<-1:12
split(test, sample(3, 12, replace = TRUE))
#$`1`
#[1] 1 6
#$`2`
#[1] 3 7 9 10 12
#$`3`
#[1] 2 4 5 8 11
set.seed(1234)
test<-1:12
split(test, sample(3, 12, replace = TRUE))
#$`1`
#[1] 1 7 8
#$`2`
#[1] 2 3 4 6 9 10 12
#$`3`
#[1] 5 11
The first argument to sample is the number of groups to split the vector into. The second argument is the number of labels to draw, which here is the number of elements in the vector. Because the labels are drawn with replacement, each successive element is independently assigned to one of the 3 groups at random. For 4 groups, just do split(test, sample(4, 12, replace = TRUE)).
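With replacement, the group sizes are themselves random, and nothing prevents a label from never being drawn, in which case fewer than 3 groups come back. A sketch that inspects the sizes follows; no seed-specific output is shown, since results differ between runs and R versions. The names grp and groups are illustrative.

```r
test <- 1:12
# one randomly drawn label (1, 2 or 3) per element
grp <- sample(3, length(test), replace = TRUE)
groups <- split(test, grp)
lengths(groups)       # sizes vary from run to run
sum(lengths(groups))  # always 12: every element lands in exactly one group
```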
It is easier than you think. To split the vector in three new randomly chosen sets run the following code:
test <- 1:12
split(sample(test), 1:3)
Every time you run this code you get a new random distribution into three different sets (perfect for k-fold cross-validation).
You get:
> split(sample(test), 1:3)
$`1`
[1] 5 8 7 3
$`2`
[1] 4 1 10 9
$`3`
[1] 2 11 12 6
> split(sample(test), 1:3)
$`1`
[1] 12 6 4 1
$`2`
[1] 3 8 7 5
$`3`
[1] 9 2 10 11
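The groups come out equal in size because split() recycles the short grouping vector 1:3 along the 12 shuffled elements. A sketch that makes the recycling explicit with rep_len(); the name recycled is illustrative.

```r
test <- 1:12
# what 1:3 looks like once recycled to the length of test
recycled <- rep_len(1:3, length(test))
recycled
# [1] 1 2 3 1 2 3 1 2 3 1 2 3
lengths(split(sample(test), recycled))
# 1 2 3 
# 4 4 4 
```

This approach therefore only yields equal-sized groups, which is why it does not by itself give the 2, 3 and 7 split asked for in the question.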
You could use an auxiliary vector to format the way you want to split your data. Example:
Data <- c(1,2,3,4,5,6)
Format <- c("X","Y","X","Y","Z","Z")
output <- split(Data,Format)
Will generate the output:
$X
[1] 1 3
$Y
[1] 2 4
$Z
[1] 5 6