Abnormal Sequencing in R

I would like to create a vector of sequenced numbers such as:
1,2,3,4,5, 2,3,4,5,1, 3,4,5,1,2
After each sequence is complete, the first number of the previous sequence should move to the last spot of the next one (so this is not simply rep(seq(1,5),3)).

You could use %% (modulo):
(1:5) %% 5 + 1 # left shift by 1
[1] 2 3 4 5 1
(1:5 + 1) %% 5 + 1 # left shift by 2
[1] 3 4 5 1 2
Also try:
(1:5 - 2) %% 5 + 1 # right shift by 1
[1] 5 1 2 3 4
(1:5 - 3) %% 5 + 1 # right shift by 2
[1] 4 5 1 2 3
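The shifts above can be combined to build the whole vector in one go. This is a minimal sketch, assuming a series length n and a number of repetitions reps (both names are introduced here for illustration, they are not from the answer):

```r
n <- 5      # length of the series (assumed name)
reps <- 3   # number of shifted copies (assumed name)

# Entry (i, k) is position i of the series left-shifted by k places.
shifted <- outer(seq_len(n), seq_len(reps) - 1,
                 function(i, k) (i + k - 1) %% n + 1)

# Reading the matrix column by column gives the desired vector.
result <- as.vector(shifted)
print(result)
```

With n = 5 and reps = 3 this should reproduce the vector from the question.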

I would start off by making a matrix with one more row than the length of the series.
> lseries <- 5
> nreps <- 3
> (values <- matrix(1:lseries, nrow = lseries + 1, ncol = nreps))
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 2 3 4
[3,] 3 4 5
[4,] 4 5 1
[5,] 5 1 2
[6,] 1 2 3
This may throw a warning (In matrix(1:lseries, nrow = lseries + 1, ncol = nreps) : data length [5] is not a sub-multiple or multiple of the number of rows [6]), which you can ignore. Note that the first lseries rows hold the data you want. We can get the final result using:
> as.vector(values[1:lseries, ])
[1] 1 2 3 4 5 2 3 4 5 1 3 4 5 1 2
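If the warning is a nuisance, the same trick can be wrapped in a small function. This is only a sketch: rotate_seq is a hypothetical helper name, and suppressWarnings() is used on the assumption that the recycling warning is the only one expected here.

```r
# Hypothetical helper wrapping the matrix-recycling trick.
rotate_seq <- function(lseries, nreps) {
  # Recycling 1:lseries into lseries + 1 rows shifts each column by one;
  # the recycling warning is expected, so it is suppressed here.
  values <- suppressWarnings(
    matrix(seq_len(lseries), nrow = lseries + 1, ncol = nreps)
  )
  # Drop the extra last row and read column by column.
  as.vector(values[seq_len(lseries), , drop = FALSE])
}

rotate_seq(5, 3)
```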

Here's a method to get a matrix of each of these:
matrix(1:5, 5, 6, byrow=TRUE)[, -6]
[,1] [,2] [,3] [,4] [,5]
[1,] 1 2 3 4 5
[2,] 2 3 4 5 1
[3,] 3 4 5 1 2
[4,] 4 5 1 2 3
[5,] 5 1 2 3 4
or turn it into a list
split.default(matrix(1:5, 5, 6, byrow=TRUE)[, -6], 1:5)
$`1`
[1] 1 2 3 4 5
$`2`
[1] 2 3 4 5 1
$`3`
[1] 3 4 5 1 2
$`4`
[1] 4 5 1 2 3
$`5`
[1] 5 1 2 3 4
or into a vector with c
c(matrix(1:5, 5, 6, byrow=TRUE)[, -6])
[1] 1 2 3 4 5 2 3 4 5 1 3 4 5 1 2 4 5 1 2 3 5 1 2 3 4
For the sake of variety, here is a second method to return the vector:
# construct the larger vector
temp <- rep(1:5, 6)
# for each value x, use which() to find its positions and pick the (x+1)th one; drop those positions
temp[-sapply(1:5, function(x) which(temp == x)[x+1])]
[1] 1 2 3 4 5 2 3 4 5 1 3 4 5 1 2 4 5 1 2 3 5 1 2 3 4

Related

Convert 3D array to tidy data frame?

I have a 3D array that looks like this:
# Create two vectors
vector1 <- c(1,2,3,4,5,6)
vector2 <- c(10, 11, 12, 13, 14, 15,16)
# Convert to 3D array
my_array <- array(c(vector1, vector2), dim = c(2,3,2))
print(my_array)
where the output is
, , 1
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
, , 2
[,1] [,2] [,3]
[1,] 10 12 14
[2,] 11 13 15
I would like to turn this into a tidy dataset, where there is one row per value and there are 4 columns:
the value itself
dimension 1
dimension 2
dimension 3
so for example, a few rows would be
Value Dimension1(Row) Dimension2(Column) Dimension3(Width)
1 1 1 1
2 2 1 1
...
15 2 3 2
Is there a good way to do this in base R, or with tidyverse tools like tidyr?
We could use reshape2::melt
library(reshape2)
melt(my_array)
-output
Var1 Var2 Var3 value
1 1 1 1 1
2 2 1 1 2
3 1 2 1 3
4 2 2 1 4
5 1 3 1 5
6 2 3 1 6
7 1 1 2 10
8 2 1 2 11
9 1 2 2 12
10 2 2 2 13
11 1 3 2 14
12 2 3 2 15
Or use as.data.frame.table in base R
as.data.frame.table(my_array)
Or we may also use
cbind(which(is.finite(my_array), arr.ind = TRUE), value = c(my_array))
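Another base R option, sketched here as an alternative rather than as part of the answer above, is arrayInd(), which maps each linear position to its subscripts. The column names value, dim1, dim2 and dim3 are chosen for illustration.

```r
my_array <- array(c(1:6, 10:15), dim = c(2, 3, 2))

# arrayInd() converts linear positions into (dim1, dim2, dim3) subscripts.
idx <- arrayInd(seq_along(my_array), dim(my_array))

tidy <- data.frame(
  value = as.vector(my_array),
  dim1 = idx[, 1],   # row
  dim2 = idx[, 2],   # column
  dim3 = idx[, 3]    # slice
)
print(tidy)
```

Unlike the is.finite() variant, this does not depend on the values in the array, so cells holding NA or Inf are kept as well.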

Generating Permutations of Values Within Multiple Lists [duplicate]

I'm trying to generate permutations by taking 1 value from 3 different lists
l <- list(A=c(1:13), B=c(1:5), C=c(1:3))
Desired result => Matrix of all the permutations where the first value can be 1-13, second value can be 1-5, third value can be 1-3
I tried using permn from the combinat package, but it seems to just rearrange the 3 lists.
> permn(l)
[[1]]
[[1]]$A
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13
[[1]]$B
[1] 1 2 3 4 5
[[1]]$C
[1] 1 2 3
[[2]]
[[2]]$A
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13
[[2]]$C
[1] 1 2 3
[[2]]$B
[1] 1 2 3 4 5
....
Expected output
[,1] [,2] [,3]
[1,] 1 1 3
[2,] 1 2 1
[3,] 1 1 2
[4,] 1 1 3
and so on...
We can use expand.grid. It can directly be applied on the list
expand.grid(l)
You can create a data frame using do.call and expand.grid. If you really need a matrix, use as.matrix on the result:
> l <- list(A=c(1:13), B=c(1:5), C=c(1:3))
> out <- do.call(expand.grid, l)
> head(out)
A B C
1 1 1 1
2 2 1 1
3 3 1 1
4 4 1 1
5 5 1 1
6 6 1 1
> tail(out)
A B C
190 8 5 3
191 9 5 3
192 10 5 3
193 11 5 3
194 12 5 3
195 13 5 3
> tail(as.matrix(out))
A B C
[190,] 8 5 3
[191,] 9 5 3
[192,] 10 5 3
[193,] 11 5 3
[194,] 12 5 3
[195,] 13 5 3
>
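As a quick sanity check on the expand.grid result, the number of rows should equal the product of the element lengths, and no combination should appear twice. A minimal sketch using the list from the question:

```r
l <- list(A = 1:13, B = 1:5, C = 1:3)

# One row per combination; the first element (A) varies fastest.
out <- as.matrix(expand.grid(l))

# The number of rows is the product of the element lengths: 13 * 5 * 3.
print(nrow(out) == prod(lengths(l)))

# anyDuplicated() returns 0 when no row is repeated.
print(anyDuplicated(out) == 0)
```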

Convert matrix to three defined columns in R

Given m:
m <- structure(c(5, 1, 3, 2, 1, 4, 5, 2, 5, 1, 1, 5, 1, 4, 0, 4, 5,
5, 3, 2, 0, 0, 3, 0, 3, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0), .Dim = c(7L,
5L))
# [,1] [,2] [,3] [,4] [,5]
# [1,] 5 2 0 0 0
# [2,] 1 5 4 3 0
# [3,] 3 1 5 0 0
# [4,] 2 1 5 3 0
# [5,] 1 5 3 2 0
# [6,] 4 1 2 3 0
# [7,] 5 4 0 0 0
Consider the element 1: it appears in 5 rows (2, 3, 4, 5, 6) and the respective column-wise indices are (1, 2, 2, 1, 2). I would like to have the following:
1 2 1
1 3 2
1 4 2
1 5 1
1 6 2
As another example, consider the element 2, it appears in 4 rows (1, 4, 5, 6) and the respective column-wise indices are (2, 1, 4, 3) and we have:
1 2 1
1 3 2
1 4 2
1 5 1
1 6 2
2 1 2
2 4 1
2 5 4
2 6 3
What I want is an n*3 matrix covering all values 1-5, preferably in base R.
A convenient way to transform it is to convert m to a sparse matrix with Matrix() from the Matrix package, since your desired output is very close to the triplet representation of a sparse matrix:
library(Matrix)
summary(Matrix(m, sparse = TRUE))
# 7 x 5 sparse Matrix of class "dgCMatrix", with 23 entries
# i j x
# 1 1 1 5
# 2 2 1 1
# 3 3 1 3
# 4 4 1 2
# 5 5 1 1
# 6 6 1 4
# 7 7 1 5
# 8 1 2 2
# 9 2 2 5
# 10 3 2 1
# 11 4 2 1
# 12 5 2 5
# 13 6 2 1
# 14 7 2 4
# 15 2 3 4
# 16 3 3 5
# 17 4 3 5
# 18 5 3 3
# 19 6 3 2
# 20 2 4 3
# 21 4 4 3
# 22 5 4 2
# 23 6 4 3
To see it better, sort by value (the %>% pipe requires dplyr or magrittr to be loaded):
library(dplyr)
summary(Matrix(m, sparse = TRUE)) %>% arrange(x)
# i j x
# 1 2 1 1
# 2 5 1 1
# 3 3 2 1
# 4 4 2 1
# 5 6 2 1
# 6 4 1 2
# 7 1 2 2
# 8 6 3 2
# 9 5 4 2
# 10 3 1 3
# 11 5 3 3
# 12 2 4 3
# 13 4 4 3
# 14 6 4 3
# 15 6 1 4
# 16 7 2 4
# 17 2 3 4
# 18 1 1 5
# 19 7 1 5
# 20 2 2 5
# 21 5 2 5
# 22 3 3 5
# 23 4 3 5
We can use which with arr.ind=TRUE
cbind(val= 1, which(m==1, arr.ind=TRUE))
# val row col
#[1,] 1 2 1
#[2,] 1 5 1
#[3,] 1 3 2
#[4,] 1 4 2
#[5,] 1 6 2
For multiple cases, as @RHertel mentioned
for(i in 1:5) print(cbind(i,which(m==i, arr.ind=TRUE)))
Or with lapply
do.call(rbind, lapply(1:2, function(i) {
m1 <-cbind(val=i,which(m==i, arr.ind=TRUE))
m1[order(m1[,2]),]}))
# val row col
#[1,] 1 2 1
#[2,] 1 3 2
#[3,] 1 4 2
#[4,] 1 5 1
#[5,] 1 6 2
#[6,] 2 1 2
#[7,] 2 4 1
#[8,] 2 5 4
#[9,] 2 6 3
Since the OP asked for base R solutions, the above should help. If somebody wants a more compact solution,
library(reshape2)
melt(m)
and then subset the values of interest.
Just use row and col.
> out <- data.frame(m=as.vector(m), row=as.vector(row(m)), col=as.vector(col(m)))
> out
m row col
1 5 1 1
2 1 2 1
3 3 3 1
4 2 4 1
5 1 5 1
...
Subset, sort, and print as desired.
> tmp <- out[order(out$m, out$row), ]
> print(subset(tmp, m==1), row.names=FALSE)
m row col
1 2 1
1 3 2
1 4 2
1 5 1
1 6 2
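Putting the pieces together for all values at once, here is a minimal base R sketch that takes every non-zero cell, attaches its value, and sorts by value and then by row. Treating 0 as "empty" is an assumption based on the example matrix.

```r
m <- structure(c(5, 1, 3, 2, 1, 4, 5, 2, 5, 1, 1, 5, 1, 4, 0, 4, 5,
                 5, 3, 2, 0, 0, 3, 0, 3, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0),
               .Dim = c(7L, 5L))

# Row and column subscripts of every non-zero cell.
idx <- which(m != 0, arr.ind = TRUE)

# Attach the cell values and sort by value, then by row.
res <- cbind(val = m[idx], idx)
res <- res[order(res[, "val"], res[, "row"]), ]
print(head(res, 9))
```

The first nine rows should match the n*3 layout shown in the question for the elements 1 and 2.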

Build summary table from matrix

I have this matrix
mdat <- matrix(c(0,1,1,1,0,0,1,1,0,1,1,1,1,0,1,1,1,1,0,1), nrow = 4, ncol = 5, byrow = TRUE)
[,1] [,2] [,3] [,4] [,5]
[1,] 0 1 1 1 0
[2,] 0 1 1 0 1
[3,] 1 1 1 0 1
[4,] 1 1 1 0 1
and I'm trying to build T:
T1 T2 T3
row1 1 2 4
row2 2 2 3
row3 2 5 5
row4 3 1 3
row5 3 5 5
row6 4 1 3
row7 4 5 5
where for each row in mdat:
T1 shows mdat row number
T2 shows mdat column where there's the first 1
T3 shows mdat column where there's the last consecutive 1.
Therefore
row1 in T is [1 2 4] because for row 1 in mdat the first 1 is in column 2 and the last consecutive 1 is in column 4.
row2 in T is [2 2 3] because for row 2 in mdat the first 1 is in column 2 and the last consecutive 1 is in column 3.
This is my try:
for (i in 1:4){
for (j in 1:5) {
if (mdat[i,j]==1) {T[i,1]<-i;T[i,2]<-j;
cont<-0;
while (mdat[i,j+cont]==1){
cont<-cont+1;
T[i,3]<-cont}
}
}
}
Here's a strategy using apply/rle as Richard suggested.
xx<-apply(mdat, 1, function(x) {
r <- rle(x)
w <- which(r$values==1)
l <- r$lengths[w]
s <- cumsum(c(0,r$lengths))[w]+1
cbind(start=s,stop=s+l-1)
})
do.call(rbind, Map(cbind, row=seq_along(xx), xx))
We start by finding the runs of 1 on each row using the "values" property of the rle and we calculate their start and stop positions using the "lengths" property. We turn this data into a list of two column matrices with one list item per row of the original matrix.
Now we use Map to add the row number back onto the matrix and then we rbind all the results. That seems to give you the data you're after
row start stop
[1,] 1 2 4
[2,] 2 2 3
[3,] 2 5 5
[4,] 3 1 3
[5,] 3 5 5
[6,] 4 1 3
[7,] 4 5 5
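One caveat worth knowing: apply() returns a list here only because the rows contain different numbers of runs. If every row had the same number of runs, apply() would simplify the result to a matrix and the Map step would no longer work as intended. The sketch below avoids that by looping over rows with lapply(); runs_of_ones is a hypothetical helper name.

```r
mdat <- matrix(c(0,1,1,1,0, 0,1,1,0,1, 1,1,1,0,1, 1,1,1,0,1),
               nrow = 4, ncol = 5, byrow = TRUE)

# Start and stop columns of each run of 1s in a 0/1 vector.
runs_of_ones <- function(x) {
  r <- rle(x)
  ends <- cumsum(r$lengths)          # last column of each run
  starts <- ends - r$lengths + 1     # first column of each run
  keep <- r$values == 1
  cbind(start = starts[keep], stop = ends[keep])
}

# lapply() always returns a list, whatever the number of runs per row.
res <- do.call(rbind, lapply(seq_len(nrow(mdat)), function(i) {
  runs <- runs_of_ones(mdat[i, ])
  cbind(row = rep(i, nrow(runs)), runs)
}))
print(res)
```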
Try the Bioconductor IRanges package:
library(IRanges)
r <- unlist(slice(split(Rle(mdat), row(mdat)), 1, rangesOnly=TRUE))
r
IRanges of length 7
start end width names
[1] 2 4 3 1
[2] 2 3 2 2
[3] 5 5 1 2
[4] 1 3 3 3
[5] 5 5 1 3
[6] 1 3 3 4
[7] 5 5 1 4
EDIT: optimized

convert rows after column

I have a CSV file that reads like this:
1 5
2 3
3 2
4 6
5 3
6 7
7 2
8 1
9 1
What I want to do is this:
1 5 4 6 7 2
2 3 5 3 8 1
3 2 6 7 9 1
That is, after every third row, I want the next block of values placed in new columns, side by side. Any advice?
Thanks a lot
Here's a way to do this with matrix indexing. It's a bit strange, but I find it interesting so I will post it.
You want an index matrix, with indices as follows. This gives the order of your data as a matrix (column-major order):
1, 1
2, 1
3, 1
1, 2
2, 2
3, 2
4, 1
...
8, 2
9, 2
This gives the pattern that you need to select the elements. Here's one approach to building such a matrix. Say that your data is in the object dat, a data frame or matrix:
m <- matrix(
c(
outer(rep(1:3, 2), seq(0,nrow(dat)-1,by=3), FUN='+'),
rep(rep(1:2, each=3), nrow(dat)/3)
),
ncol=2
)
The outer expression is the first column of the desired index matrix, and the rep expression is the second column. Now just index dat with this index matrix, and build a result matrix with three rows:
matrix(dat[m], nrow=3)
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 1 5 4 6 7 2
## [2,] 2 3 5 3 8 1
## [3,] 3 2 6 7 9 1
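A different route to the same result, offered as a sketch rather than as the method above, is to view the data as a 3-D array and permute its dimensions with aperm(). It assumes that dat is a matrix (a data frame would need as.matrix() first) and that the number of rows is a multiple of the block size. The name block is introduced here for illustration.

```r
dat <- matrix(c(1:9, 5, 3, 2, 6, 3, 7, 2, 1, 1), ncol = 2)
block <- 3  # rows per block (assumed name)

# t(dat) stores the values row by row, so the array is indexed as
# [column, row within block, block].
arr <- array(t(dat), dim = c(ncol(dat), block, nrow(dat) / block))

# Swap the first two dimensions, then flatten the blocks side by side.
res <- matrix(aperm(arr, c(2, 1, 3)), nrow = block)
print(res)
```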
a <- read.table(text = "1 5
2 3
3 2
4 6
5 3
6 7
7 2
8 1
9 1")
(seq_len(nrow(a))-1) %/% 3
# [1] 0 0 0 1 1 1 2 2 2
split(a, (seq_len(nrow(a))-1) %/% 3)
# $`0`
# V1 V2
# 1 1 5
# 2 2 3
# 3 3 2
# $`1`
# V1 V2
# 4 4 6
# 5 5 3
# 6 6 7
# $`2`
# V1 V2
# 7 7 2
# 8 8 1
# 9 9 1
do.call(cbind,split(a, (seq_len(nrow(a))-1) %/% 3))
# 0.V1 0.V2 1.V1 1.V2 2.V1 2.V2
# 1 1 5 4 6 7 2
# 2 2 3 5 3 8 1
# 3 3 2 6 7 9 1
