Linear regression for each cell in a table - r

I have four tables, each with 4 rows and 4 columns. The four tables are as follows.
For the 1st table,
t1 <- array(1:20, dim=c(4,4))
[,1] [,2] [,3] [,4]
[1,] 1 5 9 13
[2,] 2 6 10 14
[3,] 3 7 11 15
[4,] 4 8 12 16
For the 2nd table,
t2 <- array(6:25, dim=c(4,4))
[,1] [,2] [,3] [,4]
[1,] 6 10 14 18
[2,] 7 11 15 19
[3,] 8 12 16 20
[4,] 9 13 17 21
For the 3rd table,
t3 <- array(11:30, dim=c(4,4))
[,1] [,2] [,3] [,4]
[1,] 11 15 19 23
[2,] 12 16 20 24
[3,] 13 17 21 25
[4,] 14 18 22 26
For the 4th table,
t4 <- array(21:30, dim=c(4,4))
[,1] [,2] [,3] [,4]
[1,] 21 25 29 23
[2,] 22 26 30 24
[3,] 23 27 21 25
[4,] 24 28 22 26
For each table, I have a fixed y-value.
t1 = 0.1
t2 = 3
t3 = 0.5
t4 = 7
In other words:
y <- c( 0.1, 3, 0.75, 7)
Then I want to extract the x-values for each cell across the four tables. That is, for the [1,1] cell, the x-values extracted should be (1, 6, 11, 21). I repeat this step cell by cell until the end of the table, i.e. the [4,4] cell. Thus, I get a total of 16 sets of x-values as follows:
cell x-values
[1,1] (1,6,11,21)
[1,2] (5,10,15,25)
…..
[4,4] (16, 21,26,26)
Then I want to calculate the R2 of the linear regression for each y-x pair. In other words, I want to get a total of 16 R2 values as follows:
For [1,1] cell, linear regression between (0.1, 3, 0.5, 7) and (1,6,11,21) = 0.6853
For [1,2] cell, linear regression between (0.1, 3, 0.5, 7) and (5,10,15,25) = 0.6853
…..
For [4,4] cell, linear regression between (0.1, 3, 0.5, 7) and (16, 21,26,26) = 0.2719
Finally, I want to get a table with the following two columns
cell R2
[1,1] 0.6853
[1,2] 0.6853
….
[4,4] 0.2719
I learnt that to run a linear regression on x and y data series, I can use the following command:
Rcoefficient <- summary(lm(y ~ x, data=faithful))$r.squared
However, I have trouble reading each set of x-values from the four tables. I tried to use reshape, but I still cannot get it right. Could the experts on Stack Overflow suggest an efficient way to do this in R? My real tables are very large, with over 1000 columns and rows.
Thanks a lot.

I would[*] manipulate the arrays in place, by concatenating them into a 4 x 4 x 4 array:
t1 <- array(1:20, dim=c(4,4))
t2 <- array(6:25, dim=c(4,4))
t3 <- array(11:30, dim=c(4,4))
t4 <- array(21:30, dim=c(4,4))
tt <- array(c(t1,t2,t3,t4), dim = c(4,4,4))
## now you can remove the original arrays
which gives:
> tt
, , 1
[,1] [,2] [,3] [,4]
[1,] 1 5 9 13
[2,] 2 6 10 14
[3,] 3 7 11 15
[4,] 4 8 12 16
, , 2
[,1] [,2] [,3] [,4]
[1,] 6 10 14 18
[2,] 7 11 15 19
[3,] 8 12 16 20
[4,] 9 13 17 21
, , 3
[,1] [,2] [,3] [,4]
[1,] 11 15 19 23
[2,] 12 16 20 24
[3,] 13 17 21 25
[4,] 14 18 22 26
, , 4
[,1] [,2] [,3] [,4]
[1,] 21 25 29 23
[2,] 22 26 30 24
[3,] 23 27 21 25
[4,] 24 28 22 26
Then we use aperm() to rearrange the dimensions of the array so that the indices you requested are in the right order. We create a matrix from this array as a final step.
X <- matrix(aperm(tt, c(3,1,2)), ncol = 4, byrow = TRUE)
The aperm(tt, c(3,1,2)) step produces
> aperm(tt, c(3,1,2))
, , 1
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 6 7 8 9
[3,] 11 12 13 14
[4,] 21 22 23 24
, , 2
[,1] [,2] [,3] [,4]
[1,] 5 6 7 8
[2,] 10 11 12 13
[3,] 15 16 17 18
[4,] 25 26 27 28
, , 3
[,1] [,2] [,3] [,4]
[1,] 9 10 11 12
[2,] 14 15 16 17
[3,] 19 20 21 22
[4,] 29 30 21 22
, , 4
[,1] [,2] [,3] [,4]
[1,] 13 14 15 16
[2,] 18 19 20 21
[3,] 23 24 25 26
[4,] 23 24 25 26
where the indices you want are in columns, which we exploit when creating the matrix as R will treat the permuted array as a vector filled from the columns of the permuted array. X results in
> X
[,1] [,2] [,3] [,4]
[1,] 1 6 11 21
[2,] 2 7 12 22
[3,] 3 8 13 23
[4,] 4 9 14 24
[5,] 5 10 15 25
[6,] 6 11 16 26
[7,] 7 12 17 27
[8,] 8 13 18 28
[9,] 9 14 19 29
[10,] 10 15 20 30
[11,] 11 16 21 21
[12,] 12 17 22 22
[13,] 13 18 23 23
[14,] 14 19 24 24
[15,] 15 20 25 25
[16,] 16 21 26 26
Then we can proceed as per @January's answer and fit the regression (though note that I explicitly pass in y, as the scoping rules of lm() are non-standard and I'm being defensive).
y <- c( 0.1, 3, 0.75, 7)
r2 <- apply(X, 1, function(x, y) summary(lm(y ~ x))$r.squared, y = y)
This results in:
> head(r2)
[1] 0.7160542 0.7160542 0.7160542 0.7160542 0.7160542 0.7160542
Note that there is an inconsistency in your text and code. You state the response is (0.1, 3, 0.5, 7) but define y as c( 0.1, 3, 0.75, 7). The results I show use the latter but your results used the former, hence the difference.
[*] without knowing more about the context I'm not sure I'd be wanting to fit millions of linear models...
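On the footnote's concern about fitting a very large number of models: for a simple regression with one predictor and an intercept, R2 equals the squared Pearson correlation between x and y, so all the values can be obtained from a single call to cor() without fitting any model. The following is a sketch of that alternative, not part of the original answer; the names r2_lm, r2_cor and r2_table are made up for illustration.

```r
# Rebuild the example data from the question
t1 <- array(1:20,  dim = c(4, 4))
t2 <- array(6:25,  dim = c(4, 4))
t3 <- array(11:30, dim = c(4, 4))
t4 <- array(21:30, dim = c(4, 4))
y  <- c(0.1, 3, 0.75, 7)

# One row per cell, one column per table (the same X as in the answer)
tt <- array(c(t1, t2, t3, t4), dim = c(4, 4, 4))
X  <- matrix(aperm(tt, c(3, 1, 2)), ncol = 4, byrow = TRUE)

# Reference: one lm() fit per cell
r2_lm <- apply(X, 1, function(x, y) summary(lm(y ~ x))$r.squared, y = y)

# Alternative: squared correlation of every row of X with y, in one call
r2_cor <- as.vector(cor(t(X), y))^2

# Rows of X follow column-major cell order, so this puts each R2
# back at the [row, col] position of its cell
r2_table <- matrix(r2_cor, nrow = 4)
```

One caveat with this shortcut: a cell whose x-values are all identical has zero variance, so cor() returns NA (with a warning) for it.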

First, let's reformat the data.
EDIT: this code is less than optimal; see Gavin's solution in the other answer.
t <- NULL
for( row in 1:nrow( t1 ) ) {
for( col in 1:ncol( t1 ) ) {
t <- rbind( t, c( t1[ row, col ], t2[ row, col ], t3[ row, col ], t4[ row, col ] ) )
}
}
This will produce a matrix with four columns (one for each table), and nrow * ncol rows - as many rows as you have cells in one table. Check it with dim( t ). Running regression is now easy:
apply( t, 1, function( x ) { summary( lm( y ~ x ) )$r.squared } )
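The question asks for a two-column table of cell labels and R2 values. A minimal sketch of one way to build it is shown here, assuming the four tables and y from the question; the names X and result are illustrative. It relies on c() flattening a matrix column by column, so the cells come out in the order [1,1], [2,1], ..., [4,4] rather than row by row.

```r
t1 <- array(1:20,  dim = c(4, 4))
t2 <- array(6:25,  dim = c(4, 4))
t3 <- array(11:30, dim = c(4, 4))
t4 <- array(21:30, dim = c(4, 4))
y  <- c(0.1, 3, 0.75, 7)

# c() flattens each table column by column, so row i of X holds
# the values of one cell across the four tables
X <- cbind(c(t1), c(t2), c(t3), c(t4))

# R2 of y ~ x for every cell
r2 <- apply(X, 1, function(x) summary(lm(y ~ x))$r.squared)

# row() and col() give the matching cell coordinates in the same order
result <- data.frame(
  cell = sprintf("[%d,%d]", c(row(t1)), c(col(t1))),
  R2   = r2
)
head(result, 3)
```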

Related

shift matrix elements in R

n <- 5
a <- matrix(c(1:n**2),nrow = n, byrow = T)
output is
[,1] [,2] [,3] [,4] [,5]
[1,] 1 2 3 4 5
[2,] 6 7 8 9 10
[3,] 11 12 13 14 15
[4,] 16 17 18 19 20
[5,] 21 22 23 24 25
How do I shift the '1' to the current position of '25', so that the matrix looks like this:
[,1] [,2] [,3] [,4] [,5]
[1,] 2 3 4 5 6
[2,] 7 8 9 10 11
[3,] 12 13 14 15 16
[4,] 17 18 19 20 21
[5,] 22 23 24 25 1
a <- t(a); a[] <- c(a[-1], a[1]); a <- t(a)
a
# [,1] [,2] [,3] [,4] [,5]
# [1,] 2 3 4 5 6
# [2,] 7 8 9 10 11
# [3,] 12 13 14 15 16
# [4,] 17 18 19 20 21
# [5,] 22 23 24 25 1
c(a) unwinds or unlists the matrix into a vector. It does this column-first, so c(a) results in [1] 1 6 11 16 21 2 .... We want it to be row-first, though, so
t(a) transposes it, so that what was a row-first is now column-first, allowing c(a) and such to work.
c(a[-1], a[1]) is just "concatenate all except the first with the first", the classic way to put the first element of a vector at the end.
a[] <- is a way to do calcs on its values where the calcs do not preserve the "dimensionality" of the object.
After we've rearranged, we then transpose back to the original shape and row/column-order.
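The same steps can be wrapped into a small helper that shifts by any number of positions in row-major order. This is only a sketch: shift_rowmajor and its argument k are hypothetical names, and it uses modular indexing in place of the c(a[-1], a[1]) idiom so that k can be larger than 1.

```r
# Shift the elements of a matrix k positions "forward" in row-major order,
# wrapping the first k elements around to the end
shift_rowmajor <- function(a, k = 1L) {
  v <- c(t(a))                             # unwind row by row
  n <- length(v)
  idx <- (seq_len(n) - 1L + k) %% n + 1L   # k = 1 gives 2, 3, ..., n, 1
  matrix(v[idx], nrow = nrow(a), byrow = TRUE)
}

n <- 5
a <- matrix(1:n^2, nrow = n, byrow = TRUE)
shifted <- shift_rowmajor(a)   # 1 ends up where 25 was
```

With k = 0 the matrix comes back unchanged, which is a quick sanity check on the indexing.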
Here is a base R one-liner
> t(`dim<-`(t(a)[seq_along(a)%%length(a)+1],rev(dim(a))))
[,1] [,2] [,3] [,4] [,5]
[1,] 2 3 4 5 6
[2,] 7 8 9 10 11
[3,] 12 13 14 15 16
[4,] 17 18 19 20 21
[5,] 22 23 24 25 1

Vector to list of matrices

I'm trying to convert a vector to multiple matrices and save them in a list.
#Create list to save matrix
BSEPRA=vector("list", 420)
#Vector size 6720
temporalRA
I need to build 4*4 size matrices with the first 16 elements then with the next ones (17:32) and so on up to 6705:6720 to have 420 and save them in the list. But this doesn't work:
for (i in 1:length(temporalRA)){
temp2<-matrix(temporalRA[seq(1,6720, 16), ],nrow = 4,ncol = 4, )
BSEPRA[[i]]=temp2
}
vec <- 1:32
split(vec, (seq_along(vec) - 1) %/% (4*4))
# $`0`
# [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
# $`1`
# [1] 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
lapply(split(vec, (seq_along(vec) - 1) %/% 16), matrix, nrow = 4, ncol = 4)
# $`0`
# [,1] [,2] [,3] [,4]
# [1,] 1 5 9 13
# [2,] 2 6 10 14
# [3,] 3 7 11 15
# [4,] 4 8 12 16
# $`1`
# [,1] [,2] [,3] [,4]
# [1,] 17 21 25 29
# [2,] 18 22 26 30
# [3,] 19 23 27 31
# [4,] 20 24 28 32
If there's any concern that you will not have a perfect multiple of 16, I encourage you to take one extra step.
The problem:
vec <- 1:30
vecs <- split(vec, (seq_along(vec) - 1) %/% (4*4))
vecs
# $`0`
# [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
# $`1`
# [1] 17 18 19 20 21 22 23 24 25 26 27 28 29 30
lapply(vecs, matrix, nrow = 4)
# Warning in FUN(X[[i]], ...) :
# data length [14] is not a sub-multiple or multiple of the number of rows [4]
# $`0`
# [,1] [,2] [,3] [,4]
# [1,] 1 5 9 13
# [2,] 2 6 10 14
# [3,] 3 7 11 15
# [4,] 4 8 12 16
# $`1`
# [,1] [,2] [,3] [,4]
# [1,] 17 21 25 29
# [2,] 18 22 26 30
# [3,] 19 23 27 17
# [4,] 20 24 28 18
(Notice how 17:18 are "recycled".)
The fix:
vec <- 1:30
vecs <- split(vec, (seq_along(vec) - 1) %/% (4*4))
vecs <- lapply(vecs, `length<-`, max(lengths(vecs)))
vecs
# $`0`
# [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
# $`1`
# [1] 17 18 19 20 21 22 23 24 25 26 27 28 29 30 NA NA
lapply(vecs, matrix, nrow = 4)
# $`0`
# [,1] [,2] [,3] [,4]
# [1,] 1 5 9 13
# [2,] 2 6 10 14
# [3,] 3 7 11 15
# [4,] 4 8 12 16
# $`1`
# [,1] [,2] [,3] [,4]
# [1,] 17 21 25 29
# [2,] 18 22 26 30
# [3,] 19 23 27 NA
# [4,] 20 24 28 NA
Another approach that I think will be more efficient than splitting the vector and iterating over the list to make matrices is to make an array and split it:
vec <- 1:32
mdim <- 4
mdimsq <- mdim^2
asplit(array(vec, dim = c(mdim, mdim, length(vec) / mdimsq)), 3)
[[1]]
[,1] [,2] [,3] [,4]
[1,] 1 5 9 13
[2,] 2 6 10 14
[3,] 3 7 11 15
[4,] 4 8 12 16
[[2]]
[,1] [,2] [,3] [,4]
[1,] 17 21 25 29
[2,] 18 22 26 30
[3,] 19 23 27 31
[4,] 20 24 28 32
Likewise, if the vector is not a perfect multiple it can be padded with NA:
vec <- 1:30
vln <- ceiling(length(vec)/mdimsq) * mdimsq
if (length(vec) < vln) vec[vln] <- NA
asplit(array(vec, dim = c(mdim, mdim, length(vec) / mdimsq)), 3)
[[1]]
[,1] [,2] [,3] [,4]
[1,] 1 5 9 13
[2,] 2 6 10 14
[3,] 3 7 11 15
[4,] 4 8 12 16
[[2]]
[,1] [,2] [,3] [,4]
[1,] 17 21 25 29
[2,] 18 22 26 30
[3,] 19 23 27 NA
[4,] 20 24 28 NA
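The padding and array steps can be combined into one function. The sketch below is an illustration rather than code from either answer: vec_to_matrices is a hypothetical name, and it uses `length<-` to pad with NA and plain indexing of the third dimension in place of asplit().

```r
# Cut a vector into consecutive mdim x mdim matrices, padding the
# last one with NA when the length is not a multiple of mdim^2
vec_to_matrices <- function(vec, mdim = 4) {
  per <- mdim^2
  k <- ceiling(length(vec) / per)   # number of matrices needed
  length(vec) <- k * per            # extends vec with NA if it is too short
  arr <- array(vec, dim = c(mdim, mdim, k))
  lapply(seq_len(k), function(i) arr[, , i])
}

res <- vec_to_matrices(1:30)
length(res)   # two matrices; the second ends in two NA values
```

For the 6720-element vector in the question, the same call would return a list of 420 matrices of size 4 x 4.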

How to rearrange matrices?

I need to create a function, that will rearrange any square matrix based on the values in the matrix.
So if I have matrix like this:
M <- matrix(1:16, ncol = 4)
M
#> [,1] [,2] [,3] [,4]
#> [1,] 1 5 9 13
#> [2,] 2 6 10 14
#> [3,] 3 7 11 15
#> [4,] 4 8 12 16
After rearrangement it needs to look like this:
[,1] [,2] [,3] [,4]
[1,] 1 3 6 10
[2,] 2 5 9 13
[3,] 4 8 12 15
[4,] 7 11 14 16
So it is sorted from lowest (left upper corner) to highest (right lower corner), but the numbers are sorted on diagonal (is that the right word?) not in rows or columns.
I know how to do this "manually", but I can't figure out any rules that this rearrangement operates by.
1) With m holding the input matrix, row(m) + col(m) is constant along reverse diagonals, so:
M <- replace(m, order(row(m) + col(m)), m)
giving:
> M
[,1] [,2] [,3] [,4]
[1,] 1 3 6 10
[2,] 2 5 9 13
[3,] 4 8 12 15
[4,] 7 11 14 16
It is not clear whether sorted on the diagonal means just that they are unravelled from the storage order onto the reverse diagonals or that they are actually sorted after that within each reverse diagonal. In the example in the question the two interpretations give the same answer; however, if you did wish to sort the result within reverse diagonal afterwards using different data then apply this:
ave(M, row(M) + col(M), FUN = sort)
2) A longer version:
M2 <- matrix(m[order(unlist(tapply(seq_along(m), row(m) + col(m), c)))], nrow(m))
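If the input is not already in increasing storage order and the goal is "lowest in the upper-left corner, highest in the lower-right", one option is to sort the values first and then lay them out along the reverse diagonals. This is a sketch under that reading of the question; diag_sort is a hypothetical name.

```r
# Fill the reverse diagonals of a matrix with its values in increasing order
diag_sort <- function(m) {
  d <- row(m) + col(m)            # constant along each reverse diagonal
  replace(m, order(d), sort(m))   # smallest values go to the smallest d first
}

set.seed(1)
m <- matrix(sample(1:25), ncol = 5)
res <- diag_sort(m)
res
```

Because order() breaks ties by storage position, values within one reverse diagonal increase from its lower-left end to its upper-right end, which matches the layout requested in the question.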
Here's a function columns_to_diagonals in base R that ought to do what you're after. It uses split and unsplit with the appropriate factors.
columns_to_diagonals <- function(M) {
n <- ncol(M)
f <- matrix(rep(1:(2*n-1), c(1:n, (n-1):1)), ncol = n)
m <- split(M, f)
d <- row(M) + col(M)
matrix(unsplit(m, d), ncol = n)
}
First, we may test this on your original case:
M <- matrix(1:16, ncol = 4)
columns_to_diagonals(M)
#> [,1] [,2] [,3] [,4]
#> [1,] 1 3 6 10
#> [2,] 2 5 9 13
#> [3,] 4 8 12 15
#> [4,] 7 11 14 16
And then a larger, randomly permuted matrix, to check that this looks fine as well:
M <- matrix(sample(1:25), ncol = 5)
M
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 4 15 12 10 21
#> [2,] 19 7 5 23 6
#> [3,] 9 17 2 8 1
#> [4,] 3 11 16 25 14
#> [5,] 22 18 20 13 24
columns_to_diagonals(M)
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 4 9 15 18 20
#> [2,] 19 22 11 16 25
#> [3,] 3 17 2 8 6
#> [4,] 7 5 23 21 14
#> [5,] 12 10 13 1 24
Created on 2019-12-15 by the reprex package (v0.2.1)

Moving rows between subarrays

I have a number of subarrays, say 2 (for simplicity), each with the same number of rows and columns. Each spot in the subarrays is occupied by a number in [1, 10].
What I would like to do is move rows randomly between subarrays according to some rate of movement m in [0, 1]. m = 0 corresponds to no movement, while m = 1 means that any rows across all subarrays can be moved.
I take inspiration from:
How to swap a number of the values between 2 rows in R
but my problem is a bit different than this. I do know that sample() would be needed here.
Is there an easy way to go about accomplishing this?
This doesn't do it, but I believe I'm on the right track anyway.
m <- 0.2
a <- array(dim = c(5, 5, 2)) # 5 rows, 5 columns, 2 subarrays
res <- rep(sample(nrow(a), size = ceiling(nrow(a)*m), replace = FALSE)) # sample 20% of rows from array a.
Any assistance is appreciated.
It is significantly easier if you can use a matrix (2-dim array).
set.seed(2)
m <- 0.2
d <- c(10, 4)
a <- array(sample(prod(d)), dim = d)
a
# [,1] [,2] [,3] [,4]
# [1,] 8 17 14 1
# [2,] 28 37 40 26
# [3,] 22 38 16 29
# [4,] 7 35 3 32
# [5,] 34 11 23 4
# [6,] 36 33 19 31
# [7,] 5 24 30 13
# [8,] 39 6 27 25
# [9,] 15 10 12 9
# [10,] 18 2 21 20
(I'm going to set the seed again to something that conveniently gives me something "interesting" to show.)
set.seed(2)
ind <- which(runif(d[1]) < m)
ind
# [1] 1 4 7
The first randomness, runif, is compared against m and generates the indices that may change. The second randomness, sample below, takes those indices and possibly reorders them. (In this case, it reorders "1,4,7" to "4,1,7", meaning the third of the rows-that-may-change will be left unchanged.)
a[ind,] <- a[sample(ind),]
a
# [,1] [,2] [,3] [,4]
# [1,] 7 35 3 32 # <-- row 4
# [2,] 28 37 40 26
# [3,] 22 38 16 29
# [4,] 8 17 14 1 # <-- row 1
# [5,] 34 11 23 4
# [6,] 36 33 19 31
# [7,] 5 24 30 13 # <-- row 7, unchanged
# [8,] 39 6 27 25
# [9,] 15 10 12 9
# [10,] 18 2 21 20
Note that this is probabilistic, which means a probability of 0.2 does not guarantee you 20% (or even any) of the rows will be swapped.
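To put numbers on this: with each of the 10 rows selected independently with probability m, the number of selected rows follows a binomial distribution. The short sketch below (variable names are illustrative) computes the chance that no row is selected at all and checks the average count by simulation.

```r
n_rows <- 10
m <- 0.2

# Chance that runif() selects no row at all: (1 - m)^n_rows
p_none <- dbinom(0, size = n_rows, prob = m)

# Expected number of selected rows
expected_rows <- n_rows * m

# Simulated counts of selected rows over many repetitions
set.seed(42)
counts <- replicate(10000, sum(runif(n_rows) < m))
mean(counts)   # should be close to expected_rows
```

Here p_none is roughly 0.107, so about one run in nine selects no rows and leaves the matrix untouched.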
(Since I'm guessing you'd really like to preserve your 3-dim (or even n-dim) array, you might be able to use aperm to transfer between array <--> matrix.)
EDIT 1
As an alternative to a probabilistic use of runif, you can use:
ind <- head(sample(d[1]), n = d[1]*m)
to get closer to your goal of "20%". Since d[1]*m will often not be an integer, head silently truncates/floors the number, so you'll get the price-is-right winner: closest to but not over your desired percentage.
EDIT 2
A reversible method for transforming an n-dimensional array into a matrix and back again. Caveat: though the logic appears solid, my testing has only included a couple arrays.
array2matrix <- function(a) {
d <- dim(a)
ind <- seq_along(d)
a2 <- aperm(a, c(ind[2], ind[-2]))
dim(a2) <- c(d[2], prod(d[-2]))
a2 <- t(a2)
attr(a2, "origdim") <- d
a2
}
The reversal uses the "origdim" attribute if still present; this will work as long as your modifications to the matrix do not clear its attributes. (Simple row-swapping does not.)
matrix2array <- function(m, d = attr(m, "origdim")) {
ind <- seq_along(d)
m2 <- t(m)
dim(m2) <- c(d[2], d[-2])
aperm(m2, c(ind[2], ind[-2]))
}
(These two functions should probably do some more error-checks, such as is.null(d).)
A sample run:
set.seed(2)
dims <- 5:2
a <- array(sample(prod(dims)), dim=dims)
Quick show:
a[,,1,1:2,drop=FALSE]
# , , 1, 1
# [,1] [,2] [,3] [,4]
# [1,] 23 109 61 90
# [2,] 84 15 27 102
# [3,] 68 95 83 24
# [4,] 20 53 117 46
# [5,] 110 62 43 8
# , , 1, 2
# [,1] [,2] [,3] [,4]
# [1,] 118 25 14 93
# [2,] 65 21 16 77
# [3,] 87 82 3 38
# [4,] 92 12 78 17
# [5,] 49 4 75 80
The transformation:
m <- array2matrix(a)
dim(m)
# [1] 30 4
head(m)
# [,1] [,2] [,3] [,4]
# [1,] 23 109 61 90
# [2,] 84 15 27 102
# [3,] 68 95 83 24
# [4,] 20 53 117 46
# [5,] 110 62 43 8
# [6,] 67 47 1 54
Proof of reversibility:
identical(matrix2array(m), a)
# [1] TRUE
EDIT 3, "WRAP UP of all code"
Creating fake data:
dims <- c(5,4,2)
(a <- array(seq(prod(dims)), dim=dims))
# , , 1
# [,1] [,2] [,3] [,4]
# [1,] 1 6 11 16
# [2,] 2 7 12 17
# [3,] 3 8 13 18
# [4,] 4 9 14 19
# [5,] 5 10 15 20
# , , 2
# [,1] [,2] [,3] [,4]
# [1,] 21 26 31 36
# [2,] 22 27 32 37
# [3,] 23 28 33 38
# [4,] 24 29 34 39
# [5,] 25 30 35 40
(m <- array2matrix(a))
# [,1] [,2] [,3] [,4]
# [1,] 1 6 11 16
# [2,] 2 7 12 17
# [3,] 3 8 13 18
# [4,] 4 9 14 19
# [5,] 5 10 15 20
# [6,] 21 26 31 36
# [7,] 22 27 32 37
# [8,] 23 28 33 38
# [9,] 24 29 34 39
# [10,] 25 30 35 40
# attr(,"origdim")
# [1] 5 4 2
The random-swapping of rows. I'm using 50% here.
pct <- 0.5
nr <- nrow(m)
set.seed(3)
(ind1 <- sample(nr, size = ceiling(nr * pct)))
# [1] 2 8 4 3 9
(ind2 <- sample(ind1))
# [1] 3 2 9 8 4
m[ind1,] <- m[ind2,]
m
# [,1] [,2] [,3] [,4]
# [1,] 1 6 11 16
# [2,] 3 8 13 18
# [3,] 23 28 33 38
# [4,] 24 29 34 39
# [5,] 5 10 15 20
# [6,] 21 26 31 36
# [7,] 22 27 32 37
# [8,] 2 7 12 17
# [9,] 4 9 14 19
# [10,] 25 30 35 40
# attr(,"origdim")
# [1] 5 4 2
(Note that I pre-made ind1 and ind2 here, mostly to see what was going on internally. You can replace m[ind2,] with m[sample(ind1),] for the same effect.)
BTW: if we had instead used a seed of 2, we would notice that 2 rows are not swapped:
set.seed(2)
(ind1 <- sample(nr, size = ceiling(nr * pct)))
# [1] 2 7 5 10 6
(ind2 <- sample(ind1))
# [1] 6 2 5 10 7
Because of this, I chose a seed of 3 for demonstration. However, this may give the appearance of things not working. Lacking more controlling code, sample does not ensure that positions change: it is certainly reasonable to expect that "randomly swap rows" could randomly choose to move row 2 to row 2. Take for example:
set.seed(267)
(ind1 <- sample(nr, size = ceiling(nr * pct)))
# [1] 3 6 5 7 2
(ind2 <- sample(ind1))
# [1] 3 6 5 7 2
The first randomly chooses five rows, and then reorders them randomly into an unchanged order. (I suggest that if you want to force that they are all movements, you should ask a new question asking about just forcing a sample vector to change.)
Anyway, we can regain the original dimensionality with the second function:
(a2 <- matrix2array(m))
# , , 1
# [,1] [,2] [,3] [,4]
# [1,] 1 6 11 16
# [2,] 3 8 13 18
# [3,] 23 28 33 38
# [4,] 24 29 34 39
# [5,] 5 10 15 20
# , , 2
# [,1] [,2] [,3] [,4]
# [1,] 21 26 31 36
# [2,] 22 27 32 37
# [3,] 2 7 12 17
# [4,] 4 9 14 19
# [5,] 25 30 35 40
In the first plane of the array, rows 1 and 5 are unchanged; in the second plane, rows 1, 2, and 5 are unchanged. Five rows the same, five rows moved around (but otherwise unchanged within each row).
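The whole round trip can be folded into a single function. The version below is a sketch, not the answer's code: move_rows is a hypothetical name, the two conversions are inlined rather than calling array2matrix and matrix2array, and a guard is added because sample(x) on a single number x draws from 1:x instead of returning x.

```r
# Randomly move a share `pct` of the rows among all subarrays of `a`
move_rows <- function(a, pct) {
  d <- dim(a)
  stopifnot(length(d) >= 2, pct >= 0, pct <= 1)
  ind <- seq_along(d)

  # array -> matrix with one row per (row, subarray) combination
  m <- aperm(a, c(2, ind[-2]))
  dim(m) <- c(d[2], prod(d[-2]))
  m <- t(m)

  # pick the rows that may move, then shuffle them among themselves
  nr <- nrow(m)
  picked <- sample(nr, size = ceiling(nr * pct))
  if (length(picked) > 1) {
    m[picked, ] <- m[sample(picked), , drop = FALSE]
  }

  # matrix -> array with the original dimensions
  m <- t(m)
  dim(m) <- c(d[2], d[-2])
  aperm(m, c(2, ind[-2]))
}

set.seed(3)
a  <- array(seq(40), dim = c(5, 4, 2))
a2 <- move_rows(a, 0.5)
```

As in the answer, nothing here forces a picked row to end up somewhere new, so fewer than the requested share of rows may actually move.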

R: summing over an interval of rows

I'm a beginner R user and I need to write a function that sums the rows of a data frame over a fixed interval (every 4 rows).
I've tried the following code
camp<-function(X){
i<-1
n<-nrow(X)
xc<-matrix(nrow=36,ncol=m)
for (i in 1:n){
xc<-apply(X[i:(i+4),],2,sum)
rownames(xc[i])<-rownames(X[i])
i<-i+5
}
return(xc)
}
the result is "Error in X[i:(i + 4), ] : index out of range".
How can I solve this? Any suggestions?
Thanks.
The zoo package has rollapply which is pretty handy for stuff like this...
# Make some data
set.seed(1)
m <- matrix( sample( 10 , 32 , repl = TRUE ) , 8 )
# [,1] [,2] [,3] [,4]
#[1,] 3 7 8 3
#[2,] 4 1 10 4
#[3,] 6 3 4 1
#[4,] 10 2 8 4
#[5,] 3 7 10 9
#[6,] 9 4 3 4
#[7,] 10 8 7 5
#[8,] 7 5 2 6
# Sum every 4 rows
require( zoo )
tmp <- rollapply( m , width = 4 , by = 4 , align = "left" , FUN = sum )
# [,1] [,2] [,3] [,4]
#[1,] 23 13 30 12
#[2,] 29 24 22 24
You can also use rowSums() on the result if you actually want to collapse the columns into a single value for each block of 4 rows...
rowSums( tmp )
#[1] 78 99
Here is a way to do it :
## Sample data
m <- matrix(1:36, nrow=12)
## Create a "group" index
fac <- (seq_len(nrow(m))-1) %/% 4
## Apply sum
apply(m, 2, function(v) tapply(v, fac, sum))
Sample data :
[,1] [,2] [,3]
[1,] 1 13 25
[2,] 2 14 26
[3,] 3 15 27
[4,] 4 16 28
[5,] 5 17 29
[6,] 6 18 30
[7,] 7 19 31
[8,] 8 20 32
[9,] 9 21 33
[10,] 10 22 34
[11,] 11 23 35
[12,] 12 24 36
Result :
[,1] [,2] [,3]
0 10 58 106
1 26 74 122
2 42 90 138
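Base R also has rowsum(), which sums the rows of a matrix by group in one call. A short sketch using the same sample data and grouping index as above (the name sums is illustrative):

```r
## Sample data and the same "group" index as above
m <- matrix(1:36, nrow = 12)
fac <- (seq_len(nrow(m)) - 1) %/% 4

## Sum the rows within each group; row names are the group labels
sums <- rowsum(m, group = fac)
sums
```

Because the grouping index is built from the row numbers, a final block with fewer than 4 rows is simply summed over the rows it has, with no out-of-range indexing.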
