Joining data by 'rbind' as a loop in R

I have two equally long datasets, 'vpXmin' and 'vpXmax', created from 'vp':
> head(vpXmin)
vp
[1,] 253641 2621722
[2,] 253641 2622722
[3,] 253641 2623722
[4,] 253641 2624722
[5,] 253641 2625722
[6,] 253641 2626722
> head(vpXmax)
vp
[1,] 268641 2621722
[2,] 268641 2622722
[3,] 268641 2623722
[4,] 268641 2624722
[5,] 268641 2625722
[6,] 268641 2626722
I want to join the corresponding rows of these datasets using 'rbind' and store each pair as a separate matrix, e.g.
l1<-rbind(vpXmax[1,],vpXmin[1,])
l2<-rbind(vpXmax[2,],vpXmin[2,])
... ...
Even though I'm not familiar with R loops, I want to handle such a large dataset with a loop ... but I failed while trying this:
for (i in 1:length(vp)){rbind(vpXmax[i,],vpXmin[i,])}
Any idea why? Also, please point me to some good references for learning the different kinds of loops in R, if there are any. Thanks in advance.

Maybe something like:
vpXmax <- matrix(1:10,ncol=2)
vpXmin <- matrix(11:20,ncol=2)
l <- lapply(1:nrow(vpXmin),function(i) rbind(vpXmax[i,],vpXmin[i,]) )
Then, instead of l1, l2, etc., you have
l[[1]]
# [,1] [,2]
#[1,] 1 6
#[2,] 11 16
l[[2]]
# [,1] [,2]
#[1,] 2 7
#[2,] 12 17
A loop is probably not the ideal tool here, but there is also one major thing wrong with your initial loop.
You aren't assigning your output, so you need to use assign or <- in some way to actually create an object. However, using assign is pretty much a warning sign that there is a better way to do things, and <- would require pre-allocating a container or other fiddling around.
Nevertheless, assign will work, albeit polluting your workspace with l1, l2, ..., ln objects:
for (i in 1:nrow(vpXmax)) {assign(paste0("l",i), rbind(vpXmax[i,],vpXmin[i,]) )}
> l1
# [,1] [,2]
#[1,] 1 6
#[2,] 11 16
> l2
# [,1] [,2]
#[1,] 2 7
#[2,] 12 17
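A further sketch, in case it helps. If vp is a matrix (an assumption based on the printed output in the question), length(vp) counts every cell rather than every row, so 1:length(vp) runs past the last row. The version below loops over rows and stores each result with <- in a pre-allocated list; the names pairs, n_cells and n_rows are made up for illustration.

```r
# Toy stand-ins for the question's matrices
vpXmax <- matrix(1:10, ncol = 2)
vpXmin <- matrix(11:20, ncol = 2)

n_cells <- length(vpXmax)  # number of cells (rows * columns)
n_rows  <- nrow(vpXmax)    # number of rows

# Pre-allocate a list, then fill one slot per row
pairs <- vector("list", n_rows)
for (i in seq_len(n_rows)) {
  pairs[[i]] <- rbind(vpXmax[i, ], vpXmin[i, ])
}
```

Each pair is then available as pairs[[i]], without creating one workspace object per row.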

As @ToNoy indicates, it is not obvious what kind of output you want. The easiest way to proceed would be to create a list in which each element is the result of rbind-ing the corresponding rows of the two original data frames.
A <- data.frame("a" = runif(100, -1, 0), "b" = runif(100, 0, 1))
Z <- data.frame("a" = runif(100, -2, -1), "b" = runif(100, 1, 2))
output <- vector("list", nrow(A))
for (i in 1:nrow(A)) {
output[[i]] <- rbind(A[i, ], Z[i, ])
}
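If the inputs are numeric matrices rather than data frames (an assumption; A and Z are redefined here as small toy matrices), the same pairs could also be kept in a single 3-D array instead of a list. This is only a sketch, and the name stacked is made up:

```r
# Toy matrices standing in for the two inputs
A <- matrix(1:10, ncol = 2)
Z <- matrix(11:20, ncol = 2)

# Dimensions: (which input, column, original row)
stacked <- array(NA_integer_, dim = c(2, ncol(A), nrow(A)))
stacked[1, , ] <- t(A)  # first row of every slice comes from A
stacked[2, , ] <- t(Z)  # second row of every slice comes from Z

# Slice i holds the same values as rbind(A[i, ], Z[i, ])
stacked[, , 3]
```

Whether a list or an array is more convenient depends on what is done with the pairs afterwards.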

Related

Convert list to matrix using indicator vector

I have the following list of 5x2 matrices:
l <- list(a=matrix(rnorm(10),nrow=5,ncol=2),
b=matrix(rnorm(10),nrow=5,ncol=2),
c=matrix(rnorm(10),nrow=5,ncol=2))
For example, the first element of this list looks like this:
$a
[,1] [,2]
[1,] -0.4988268 1.9881333
[2,] -0.2979064 1.5921169
[3,] -1.3783522 -1.4149601
[4,] 0.2205115 0.2029210
[5,] 1.2721645 0.2861253
I want to take this list and create a new 5x2 matrix using information from a vector v:
v <- c("a","a","b","c","b")
This vector is an indicator vector that has information on how this new matrix should be constructed. That is, take row 1 from list element a, take row 2 from list element a and so on.
One could do it through a for-loop, however, for my application this is not efficient enough and I feel there might be a more elegant solution to it. My approach:
goal <- matrix(nrow=5,ncol=2)
for(i in 1:length(v)){
goal[i,] <- l[[v[i]]][i,]
}
goal
[,1] [,2]
[1,] -0.4988268 1.98813326
[2,] -0.2979064 1.59211686
[3,] 0.7715907 0.16776669
[4,] 0.2690278 0.02542766
[5,] 1.7865093 0.46361239
Thanks!
Assuming all the list matrices have the same number of rows, we could use mapply and subset the matrices by name (v) and row number.
t(mapply(function(x, y) l[[x]][y, ], v, 1:nrow(l[[1]])))
# [,1] [,2]
#a -1.2070657 0.5060559
#a 0.2774292 -0.5747400
#b -0.7762539 -0.9111954
#c 0.4595894 -0.0151383
#b 0.9594941 2.4158352
data
set.seed(1234)
l <- list(a=matrix(rnorm(10),nrow=5,ncol=2),
b=matrix(rnorm(10),nrow=5,ncol=2),
c=matrix(rnorm(10),nrow=5,ncol=2))
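Another possibility, sketched under the same assumption that all matrices in the list share the same dimensions: stack them into a 3-D array and pull out the wanted cells with matrix indexing. The names arr, rows and slice are made up for illustration.

```r
set.seed(1234)
l <- list(a = matrix(rnorm(10), nrow = 5, ncol = 2),
          b = matrix(rnorm(10), nrow = 5, ncol = 2),
          c = matrix(rnorm(10), nrow = 5, ncol = 2))
v <- c("a", "a", "b", "c", "b")

# Stack the matrices into a 5 x 2 x 3 array (slice k is l[[k]])
arr <- array(unlist(l), dim = c(dim(l[[1]]), length(l)))

rows  <- seq_along(v)        # row i of the result comes from row i ...
slice <- match(v, names(l))  # ... of the list element named v[i]

# One matrix-indexing lookup per column
goal <- sapply(seq_len(ncol(l[[1]])),
               function(j) arr[cbind(rows, j, slice)])
```

I have not benchmarked this against the mapply version, so treat any speed benefit as unverified.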

Performing element-wise standard deviation in R with two matrices

As the title suggests, I am looking for a way to get the standard deviation per element from two separate matrices. However, I am quite the beginner at R and I can't seem to figure out how to do this. Below is an example of what I am trying to accomplish with a small sample of my data (first three rows).
I have two matrices with coordinates (df143 and df143_2, or matrices A and B as you will)
A:
[1,] 21.729504 -55.66055 -37.26477
[2,] 39.445610 -67.67449 -32.19464
[3,] 57.604027 -54.16734 -28.48679
B:
[1,] 21.706865 -55.50722 -37.57840
[2,] 39.553314 -67.68414 -31.95995
[3,] 57.286247 -54.13008 -28.44446
I am looking for a matrix output that shows the standard deviation per element of the two combined matrices.
Or you can do base R:
matrix(mapply(function(x,y) sd(c(x,y)),A, B), ncol=ncol(A))
# [,1] [,2] [,3]
#[1,] 0.01600819 0.10842068 0.22176990
#[2,] 0.07615823 0.00682358 0.16595089
#[3,] 0.22470439 0.02634680 0.02993183
I believe this is what you're looking to do:
library(abind)
a <- c(21.729504, -55.66055, -37.26477, 39.445610, -67.67449, -32.19464, 57.604027, -54.16734, -28.48679)
a <- matrix(a, ncol=3, byrow=TRUE)
b <- c(21.706865, -55.50722, -37.57840, 39.553314, -67.68414, -31.95995, 57.286247, -54.13008, -28.44446)
b <- matrix(b, ncol=3, byrow=TRUE)
m <- abind(a, b, along=3)
apply(m, 1:2, sd)
## [,1] [,2] [,3]
## [1,] 0.01600819 0.10842068 0.22176990
## [2,] 0.07615823 0.00682358 0.16595089
## [3,] 0.22470439 0.02634680 0.02993183
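Since each standard deviation here is taken over exactly two values, there is also a closed form: for two numbers x and y, the sample standard deviation (the n - 1 version that sd() computes) equals |x - y| / sqrt(2). A minimal sketch using the three rows from the question; the names sd_closed and sd_mapply are made up:

```r
A <- matrix(c(21.729504, -55.66055, -37.26477,
              39.445610, -67.67449, -32.19464,
              57.604027, -54.16734, -28.48679), ncol = 3, byrow = TRUE)
B <- matrix(c(21.706865, -55.50722, -37.57840,
              39.553314, -67.68414, -31.95995,
              57.286247, -54.13008, -28.44446), ncol = 3, byrow = TRUE)

# Closed form for the sd of two values, applied to whole matrices at once
sd_closed <- abs(A - B) / sqrt(2)

# The mapply approach from above, for comparison
sd_mapply <- matrix(mapply(function(x, y) sd(c(x, y)), A, B), ncol = ncol(A))
```

This shortcut only applies to the two-matrix case; with three or more matrices the apply-over-an-array approach is the more general one.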

R Loop and Matrices

I am trying to get this simple 'for loop' to work. I can't get dim(F4) to be a 6848x2 matrix. I just want to divide the row entries of two matrices. Here's what I have...
> dim(F3)
[1] 6848 2
> head(F3)
[,1] [,2]
[1,] 140.9838 516.0239
[2,] 140.9838 516.0239
[3,] 140.9838 516.0239
[4,] 140.9838 516.0239
[5,] 140.9838 516.0239
[6,] 175.5093 515.2280
> dim(scale)
[1] 6848 1
F4 <- matrix(, nrow = nrow(F1), ncol = 1)
for (i in 1:t){
F4[i,]<-(F3[i]/scale[i])} #ONLY WANT F3(i) ROW TO BE DIVIDED BY SCALE(i) ROW
> dim(F4) #DOESN'T GIVE ME 6848x2 Matrix
[1] 6848 1
No need to use a for loop here. Here is a vectorized solution:
F3/as.vector(scale) ## BAD! use of built-in function "scale" as a variable!
Example :
mat <- matrix(1:8,4,2)
sx <- matrix(1:4,4,1)
mat /as.vector(sx)
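as.vector is used to drop the matrix dimensions of sx, so that the resulting vector is recycled down the columns and each row of mat is divided by the matching element of sx.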
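To make the behaviour concrete, here is a small sketch on the same toy data. It compares the recycled division with sweep(), which states the row-wise intent explicitly, and shows that dividing the two matrices directly fails because their dimensions differ. The names by_recycling, by_sweep and direct are made up.

```r
mat <- matrix(1:8, 4, 2)
sx  <- matrix(1:4, 4, 1)

# Vector of length nrow(mat): recycled down each column,
# so row i is divided by sx[i]
by_recycling <- mat / as.vector(sx)

# Same result with the row-wise intent spelled out (MARGIN = 1 means rows)
by_sweep <- sweep(mat, 1, sx[, 1], "/")

# A 4x2 matrix divided by a 4x1 matrix is an error, caught here with try()
direct <- try(mat / sx, silent = TRUE)
```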

How to use some apply function to solve what requires two for-loops in R

I have a matrix, named "mat", and a smaller matrix, named "center".
temp = c(1.8421,5.6586,6.3526,2.904,3.232,4.6076,4.8,3.2909,4.6122,4.9399)
mat = matrix(temp, ncol=2)
[,1] [,2]
[1,] 1.8421 4.6076
[2,] 5.6586 4.8000
[3,] 6.3526 3.2909
[4,] 2.9040 4.6122
[5,] 3.2320 4.9399
center = matrix(c(3, 6, 3, 2), ncol=2)
[,1] [,2]
[1,] 3 3
[2,] 6 2
I need to compute the distance between each row of mat with every row of center. For example, the distance of mat[1,] and center[1,] can be computed as
diff = mat[1,]-center[1,]
t(diff)%*%diff
[,1]
[1,] 3.92511
Similarly, I can find the distance of mat[1,] and center[2,]
diff = mat[1,]-center[2,]
t(diff)%*%diff
[,1]
[1,] 24.08771
Repeating this process for each row of mat, I end up with
[,1] [,2]
[1,] 3.925110 24.087710
[2,] 10.308154 7.956554
[3,] 11.324550 1.790750
[4,] 2.608405 16.408805
[5,] 3.817036 16.304836
I know how to implement it with for-loops. I was really hoping someone could tell me how to do it with some kind of an apply() function, maybe mapply() I guess.
Thanks
apply(center, 1, function(x) colSums((x - t(mat)) ^ 2))
# [,1] [,2]
# [1,] 3.925110 24.087710
# [2,] 10.308154 7.956554
# [3,] 11.324550 1.790750
# [4,] 2.608405 16.408805
# [5,] 3.817036 16.304836
If you want apply for the expressiveness of the code, that's one thing, but it's still looping, just with different syntax. This can be done without any loops, or with a very small one across center instead of mat. I'd just transpose first, because it's wise to get into the habit of moving as much work as possible out of the apply call. (The BrodieG answer is pretty much identical in function.) These work because R automatically recycles the smaller vector along the matrix, and does so much faster than apply or for.
tm <- t(mat)
apply(center, 1, function(m){
colSums((tm - m)^2) })
Use dist and then extract the relevant submatrix:
ix <- 1:nrow(mat)
as.matrix( dist( rbind(mat, center) )^2 )[ix, -ix]
#          6         7
# 1 3.925110 24.087710
# 2 10.308154 7.956554
# 3 11.324550 1.790750
# 4 2.608405 16.408805
# 5 3.817036 16.304836
REVISION: simplified slightly.
You could use outer as well
d <- function(i, j) sum((mat[i, ] - center[j, ])^2)
outer(1:nrow(mat), 1:nrow(center), Vectorize(d))
This will solve it
t(apply(mat,1,function(row){
d1<-sum((row-center[1,])^2)
d2<-sum((row-center[2,])^2)
return(c(d1,d2))
}))
Result:
[,1] [,2]
[1,] 3.925110 24.087710
[2,] 10.308154 7.956554
[3,] 11.324550 1.790750
[4,] 2.608405 16.408805
[5,] 3.817036 16.304836
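For completeness, a fully loop-free sketch is possible with the expansion |x - c|^2 = |x|^2 + |c|^2 - 2 x.c, computing all pairs at once with matrix operations. The name sq_dist is made up; the numbers are the ones from the question.

```r
temp <- c(1.8421, 5.6586, 6.3526, 2.904, 3.232,
          4.6076, 4.8, 3.2909, 4.6122, 4.9399)
mat <- matrix(temp, ncol = 2)
center <- matrix(c(3, 6, 3, 2), ncol = 2)

# Squared norms of every row of mat and center, combined for all pairs,
# minus twice the matrix of dot products mat %*% t(center)
sq_dist <- outer(rowSums(mat^2), rowSums(center^2), "+") -
  2 * tcrossprod(mat, center)
```

Note that this form can lose a little precision to rounding when points are very close together, so for small inputs like this one the apply versions are just as reasonable.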

replace diagonal elements in an array

Does anyone know a neat/efficient way to replace diagonal elements in an array, similar to the use of diag(x) <- value for a matrix? In other words something like this:
> m<-array(1:27,c(3,3,3))
> for(k in 1:3){
+ diag(m[,,k])<-5
+ }
> m
, , 1
[,1] [,2] [,3]
[1,] 5 4 7
[2,] 2 5 8
[3,] 3 6 5
, , 2
[,1] [,2] [,3]
[1,] 5 13 16
[2,] 11 5 17
[3,] 12 15 5
, , 3
[,1] [,2] [,3]
[1,] 5 22 25
[2,] 20 5 26
[3,] 21 24 5
but without the use of a for loop (my arrays are pretty large and this manipulation will already be within a loop).
Many thanks.
Try this:
with(expand.grid(a = 1:3, b = 1:3), replace(m, cbind(a, a, b), 5))
EDIT:
The question asked for neat/efficient but, of course, those are not the same thing. The one-liner here is compact and loop-free, but if you are looking for speed I think you will find that the loop in the question is actually the fastest of all the answers.
You can use the following function for that, provided you have only 3 dimensions in your array. You can generalize to more dimensions based on this code, but I'm too lazy to do that for you ;-)
`arraydiag<-` <- function(x,value){
dims <- dim(x)
id <- seq_len(dims[1]) +
dims[2]*(seq_len(dims[2])-1)
id <- outer(id,(seq_len(dims[3])-1)*prod(dims[1:2]),`+`)
x[id] <- value
dim(x) <- dims
x
}
This works like :
m<-array(1:36,c(3,3,4))
arraydiag(m)<-NA
m
Note that, contrary to the diag() function, this function cannot deal with matrices that are not square. You can look at the source code of diag() to find out how to adapt this code so that it does.
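diagArr <- function(dim) {
  n <- dim[2]
  if (dim[1] != n) stop("expecting first two dimensions to be equal")
  # linear indices of the diagonal within the first n x n slice
  d <- seq(1, n*n, by=n+1)
  # shift those indices by n*n for each further slice
  as.vector(outer(d, seq(0, by=n*n, length.out=prod(dim[-1:-2])), "+"))
}
m[diagArr(dim(m))] <- 5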
This is written with the intention that it works for dimensions higher than 3 but I haven't tested it in that case. Should be okay though.
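One more loop-free sketch, using base R's slice.index(): it returns, for every cell of an array, that cell's index along a chosen dimension, so comparing the indices along dimensions 1 and 2 gives a logical mask of the diagonal cells in every slice. I have only tried the idea on a small 3-D example like the one in the question and have not timed it on large arrays.

```r
m <- array(1:27, c(3, 3, 3))

# TRUE wherever the row index equals the column index, in every slice
on_diag <- slice.index(m, 1) == slice.index(m, 2)

m[on_diag] <- 5
```

Because the mask is built from the array itself, this does not require the first two dimensions to be equal, although it does allocate two index arrays the size of m.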
