I have created an example below, where I am trying to make a list of each row of a matrix, then use apply().
mat<-matrix(rexp(9, rate=.1), ncol=3)
my_list2 <- list()
for(i in 1:nrow(mat)) {
my_list2[[i]] <- mat[i,]
}
#DO NOT CHANGE THIS:
apply(my_list2[[i]],2,sum)
However the apply() function does not work, giving a dimension error. I understand that apply() is not the best function to use here but it is present in a function that I need so I cannot change that line.
Does anyone have any idea how I can change my "my_list2" to work better? Thank you!
Edit:
Here is an example that works (not reproducible):
[screenshot: Example]
Note both the example above and this example have type "list"
This answer addresses "how to properly get a list of matrices", not how to resolve the use of apply.
By default in R, when you subset a matrix to a single column or a single row, it reduces the dimensionality. For instance,
mtx <- matrix(1:6, nrow = 2)
mtx
# [,1] [,2] [,3]
# [1,] 1 3 5
# [2,] 2 4 6
mtx[1,]
# [1] 1 3 5
mtx[,3]
# [1] 5 6
If you want a single row or column but to otherwise retain dimensionality, add the drop=FALSE argument to the [-subsetting:
mtx[1,,drop=FALSE]
# [,1] [,2] [,3]
# [1,] 1 3 5
mtx[,3,drop=FALSE]
# [,1]
# [1,] 5
# [2,] 6
In this way, your code to produce sample data can be adjusted to be:
set.seed(42) # important for reproducibility in questions on SO
mat<-matrix(rexp(9, rate=.1), ncol=3)
my_list2 <- list()
for(i in 1:nrow(mat)) {
my_list2[[i]] <- mat[i,,drop=FALSE]
}
my_list2
# [[1]]
# [,1] [,2] [,3]
# [1,] 1.983368 0.381919 3.139846
# [[2]]
# [,1] [,2] [,3]
# [1,] 6.608953 4.731766 4.101296
# [[3]]
# [,1] [,2] [,3]
# [1,] 2.83491 14.63627 11.91598
And then you can use akrun's most recent code to get the column-wise sums within each list element (each element is now a one-row matrix), i.e., one of
lapply(my_list2, apply, 2, sum)
lapply(my_list2, function(z) apply(z, 2, sum))
lapply(my_list2, \(z) apply(z, 2, sum)) # R-4.1 or later
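The same list can be built without an explicit for-loop. The following is a minimal sketch, assuming the 3x3 sample matrix from above; the names rows_as_matrices and col_sums are made up for this illustration and do not come from the question.

```r
set.seed(42)
mat <- matrix(rexp(9, rate = 0.1), ncol = 3)

# One list element per row; drop = FALSE keeps each row as a 1 x 3 matrix
rows_as_matrices <- lapply(seq_len(nrow(mat)),
                           function(i) mat[i, , drop = FALSE])

# apply() over columns now works on every element, because each one has dim()
col_sums <- lapply(rows_as_matrices, apply, 2, sum)
```

Because every element has a single row, each column "sum" is just that row's value in the column.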
In your screenshot it works because the object part of the list ex[[1]] is an array. And in your example the elements of your list are vectors. You could try the following:
mat<-matrix(rexp(9, rate=.1), ncol=3)
my_list2 <- list()
for(i in 1:nrow(mat)) {
my_list2[[i]] <- as.matrix(mat[i,])
}
#DO NOT CHANGE THIS:
apply(my_list2[[1]],2,sum)
apply(my_list2[[2]],2,sum)
apply(my_list2[[3]],2,sum)
Note that this apply() call handles only one element of the list at a time. To process all three elements in one line, that line would have to be changed.
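To see why the shape of each list element matters, here is a small sketch with a made-up vector v (not from the question). It compares a plain vector, the column matrix produced by as.matrix(), and a one-row matrix.

```r
v <- c(1, 3, 5)

# A plain vector has no dim attribute, so apply() stops with an error;
# the message is captured here instead of aborting the script
err <- tryCatch(apply(v, 2, sum), error = function(e) conditionMessage(e))

# as.matrix() on a vector gives a 3 x 1 column matrix: one column, one sum
col_mat <- as.matrix(v)
sum_col <- apply(col_mat, 2, sum)   # 9

# A 1 x 3 row matrix has three columns, so three "sums" come back
row_mat <- matrix(v, nrow = 1)
sum_row <- apply(row_mat, 2, sum)   # 1 3 5
```

So as.matrix() and drop=FALSE both make apply() run, but they give differently shaped results.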
I want to compute the cumulative sum across the first (n-1) columns (for a matrix with n columns) and then average the values. I created a sample matrix for this task. I have the following matrix
ma = matrix(c(1:10), nrow = 2, ncol = 5)
ma
[,1] [,2] [,3] [,4] [,5]
[1,] 1 3 5 7 9
[2,] 2 4 6 8 10
I wanted to find the following
ans = matrix(c(1,2,2,3,3,4,4,5), nrow = 2, ncol = 4)
ans
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 2 3 4 5
The following is my R function.
ColCumSumsAve <- function(y){
for(i in seq_len(dim(y)[2]-1)) {
y[,i] <- cumsum(y[,i])/i
}
}
ColCumSumsAve(ma)
However, when I run the above function it's not producing any output. Are there any mistakes in the code?
Thanks.
There were several mistakes.
Solution
This is what I tested and what works:
colCumSumAve <- function(m) {
csum <- t(apply(X=m, MARGIN=1, FUN=cumsum))
res <- t(Reduce(`/`, list(t(csum), 1:ncol(m))))
res[, 1:(ncol(m)-1)]
}
Test it with:
> colCumSumAve(ma)
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 2 3 4 5
which is correct.
Explanation:
colCumSumAve <- function(m) {
csum <- t(apply(X=m, MARGIN=1, FUN=cumsum)) # calculate row-wise cumsum
res <- t(Reduce(`/`, list(t(csum), 1:ncol(m))))
# This is the trickiest part.
# Because `csum` is a matrix, it is treated like a vector (column by column)
# when `Reduce`-ing using `/` with the vector `1:ncol(m)`.
# To get quasi-row-wise treatment, I change the orientation
# of the matrix with `t()`.
# As a consequence, the output is in this transposed
# orientation, so I apply `t()` again
# to the entire result at the end to get back the original
# input matrix orientation.
# `Reduce` using `/` over the list of `t(csum)` and
# `1:ncol(m)` has the effect of dividing the `csum` values by their
# corresponding column position.
res[, 1:(ncol(m)-1)] # removes last column for the answer.
# This could of course be done right at the beginning,
# saving the calculation of the values in the last column,
# but that calculation is not the speed-limiting step
# (since it is vectorized);
# rather, `apply` and `Reduce` are the speed-limiting steps.
}
Well, okay, I could do then:
colCumSumAve <- function(m) {
csum <- t(apply(X=m[, 1:(ncol(m)-1)], MARGIN=1, FUN=cumsum))
t(Reduce(`/`, list(t(csum), 1:ncol(csum)))) # csum has one column fewer than m
}
or:
colCumSumAve <- function(m) {
m <- m[, 1:(ncol(m)-1)] # remove last column
csum <- t(apply(X=m, MARGIN=1, FUN=cumsum))
t(Reduce(`/`, list(t(csum), 1:ncol(m))))
}
This is actually the more optimized solution, then.
Original Function
Your original function only makes assignments in the for-loop and doesn't return anything.
So I first copied your input into res, processed it with your for-loop, and then returned res.
ColCumSumsAve <- function(y){
res <- y
for(i in seq_len(dim(y)[2]-1)) {
res[,i] <- cumsum(y[,i])/i
}
res
}
However, this gives:
> ColCumSumsAve(ma)
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1.5 1.666667 1.75 9
[2,] 3 3.5 3.666667 3.75 10
The problem is that cumsum(y[,i]) accumulates down column i instead of row-wise across the columns. (Applied to a whole matrix, cumsum also treats it like a vector, going through it column by column.)
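A quick way to see the direction of cumsum is to compare it on one column, one row, and the whole matrix. This is only an illustrative sketch on the sample matrix ma from the question; the variable names are made up.

```r
ma <- matrix(1:10, nrow = 2, ncol = 5)

# Down a single column (what the original loop did for column 2)
down_col2 <- cumsum(ma[, 2])    # 3 7

# Across a single row (what the task actually asks for)
across_row1 <- cumsum(ma[1, ])  # 1 4 9 16 25

# On the whole matrix: dims are dropped, values are taken column by column
flattened <- cumsum(ma)
```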
Corrected Original Function
After some fiddling, I realized that the correct solution is:
ColCumSumsAve <- function(y){
res <- matrix(NA, nrow(y), ncol(y)-1)
# create empty matrix with the dimensions of y minus last column
for (i in 1:(nrow(y))) { # go through rows
for (j in 1:(ncol(y)-1)) { # go through columns
res[i, j] <- sum(y[i, 1:j])/j # for each position do this
}
}
res # return `res`ult by calling it at the end!
}
with the testing:
> ColCumSumsAve(ma)
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 2 3 4 5
Note: dim(y)[2] is ncol(y), and dim(y)[1] is nrow(y);
and instead of seq_len(), 1: is shorter and I guess even slightly faster.
Note: My solution given first will be faster, since it uses apply, vectorized cumsum and Reduce. For-loops in R are slower.
Late Note: I am not so sure that the first solution is faster. Since R-3.x it seems that for-loops are faster. Reduce will be the speed-limiting function and can sometimes be incredibly slow.
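For comparison, here is a sketch that avoids both Reduce and the nested loops by dividing with sweep(). The function name col_cum_ave is made up for this example, and it assumes a numeric matrix with at least two columns; no timing claims are made.

```r
col_cum_ave <- function(m) {
  # Row-wise cumulative sums; apply() returns them transposed, hence t()
  csum <- t(apply(m, 1, cumsum))
  # Divide column j by j to turn cumulative sums into running averages
  avg <- sweep(csum, 2, seq_len(ncol(csum)), "/")
  # Drop the last column, as the question asks
  avg[, -ncol(avg), drop = FALSE]
}

ma <- matrix(1:10, nrow = 2, ncol = 5)
result <- col_cum_ave(ma)
```

On the sample matrix this gives the same 2 x 4 result as the versions above.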
k <- t(apply(ma,1,cumsum))[,-ncol(ma)] # k does not exist yet, so use ncol(ma)
for (i in 1:ncol(k)){
k[,i] <- k[,i]/i
}
k
This should work.
All you need is rowMeans:
nc <- 4
cbind(ma[,1],sapply(2:nc,function(x) rowMeans(ma[,1:x])))
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 2 3 4 5
Here's how I did it
> t(apply(ma, 1, function(x) cumsum(x) / 1:length(x)))[,-NCOL(ma)]
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 2 3 4 5
This applies the cumsum function row-wise to the matrix ma and then divides by the correct length to get the average (cumsum(x) and 1:length(x) will have the same length). Then simply transpose with t and remove the last column with [,-NCOL(ma)].
The reason why there is no output from your function is because you aren't returning anything. You should end the function with return(y) or simply y as Marius suggested. Regardless, your function doesn't seem to give you the correct response anyway.
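The "no output" behaviour can be reproduced in isolation. In this sketch the two function names are hypothetical; the only difference between them is the final y. A function whose last expression is a for-loop returns NULL invisibly.

```r
ma <- matrix(1:10, nrow = 2, ncol = 5)

# Last expression is the for-loop, so the modified y is never returned
scale_no_return <- function(y) {
  for (i in seq_len(ncol(y))) y[, i] <- y[, i] / i
}

# Same loop, but the modified copy is returned explicitly
scale_with_return <- function(y) {
  for (i in seq_len(ncol(y))) y[, i] <- y[, i] / i
  y
}

out_missing <- scale_no_return(ma)    # NULL
out_present <- scale_with_return(ma)  # 2 x 5 matrix
```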
I'm trying to get different elements from the diagonals of multiple matrices stored in a list. My data looks something like this:
res <- list()
res[[1]] <- matrix(c(0.04770856,0.02854005,0.02854005,0.03260190), nrow=2, ncol=2)
res[[2]] <- matrix(c(0.05436957,0.04887182,0.04887182, 0.10484454), nrow=2, ncol=2)
> res
[[1]]
[,1] [,2]
[1,] 0.04770856 0.02854005
[2,] 0.02854005 0.03260190
[[2]]
[,1] [,2]
[1,] 0.05436957 0.04887182
[2,] 0.04887182 0.10484454
> diag(res[[1]])
[1] 0.04770856 0.03260190
> diag(res[[2]])
[1] 0.05436957 0.10484454
I would like to save the first and second elements of each diagonal of a given list into a vector similar to this:
d.1st.el <- c(0.04770856, 0.05436957)
d.2nd.el <- c(0.03260190, 0.10484454)
My issue is writing a function that runs over all matrices in the list and gets the diagonals. For some reason, when I use unlist() to extract the values of each matrix for a given level, it returns the full matrix rather than just the diagonal values.
Does anyone have a simple solution?
sapply(res, diag)
[,1] [,2]
[1,] 0.04770856 0.05436957
[2,] 0.03260190 0.10484454
# or
lapply(res, diag)
[[1]]
[1] 0.04770856 0.03260190
[[2]]
[1] 0.05436957 0.10484454
If you want the vectors for some reason in your global environment:
alld <- lapply(res, diag)
names(alld) <- sprintf("d.%d.el", 1:length(alld))
list2env(alld, globalenv())
In two steps you can do:
# Step 1 - Get the diagonals
all_diags <- sapply(res, function(x) diag(t(x)))
print(all_diags)
[,1] [,2]
[1,] 0.04770856 0.05436957
[2,] 0.03260190 0.10484454
# Step 2 - Append to vectors
d.1st.el <- all_diags[1,]
d.2nd.el <- all_diags[2,]
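If only the n-th diagonal element of each matrix is needed, vapply gives a type-checked alternative. This is a sketch on the data from the question; the helper name nth_diag is made up.

```r
res <- list(
  matrix(c(0.04770856, 0.02854005, 0.02854005, 0.03260190), nrow = 2, ncol = 2),
  matrix(c(0.05436957, 0.04887182, 0.04887182, 0.10484454), nrow = 2, ncol = 2)
)

# Hypothetical helper: n-th diagonal element of every matrix in the list
nth_diag <- function(mats, n) {
  vapply(mats, function(m) diag(m)[n], numeric(1))
}

d.1st.el <- nth_diag(res, 1)
d.2nd.el <- nth_diag(res, 2)
```

vapply stops with an error if any matrix does not yield exactly one number, which sapply would not do.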
I have a similar situation like this:
set.seed(2014)
df<-data.frame(
group=rbinom(100,1,0.6),
y1=rbinom(100,1,0.3),
y2=rbinom(100,1,0.8))
for (y in c("y1","y1")){
test<-summary(table(df[,"group"],df[,y]))
output<-do.call(rbind,list(cbind(test$statistic,test$p.value)))
}
output
[,1] [,2]
[1,] 1.066 0.3019
I'm wondering why it's not an output as I expected:
output
[,1] [,2]
[1,] 1.066 0.3019
[2,] 0.00011 1
In each iteration of the loop (you've used y1 twice) output is overwritten by a new value. Presumably you were aiming for something like:
set.seed(2014)
df<-data.frame(
group=rbinom(100,1,0.6),
y1=rbinom(100,1,0.3),
y2=rbinom(100,1,0.8))
output <- NULL
for (y in c("y1","y2")){
test<-summary(table(df[,"group"],df[,y]))
output<-rbind(output,cbind(test$statistic,test$p.value))
}
output
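Growing output with rbind works; an alternative is to preallocate the result and fill one row per variable. The sketch below assumes the same simulated data; the name vars and the row and column labels are additions for readability, not part of the original code.

```r
set.seed(2014)
df <- data.frame(
  group = rbinom(100, 1, 0.6),
  y1 = rbinom(100, 1, 0.3),
  y2 = rbinom(100, 1, 0.8))

vars <- c("y1", "y2")
# One row per variable, filled in place instead of being overwritten
output <- matrix(NA_real_, nrow = length(vars), ncol = 2,
                 dimnames = list(vars, c("statistic", "p.value")))
for (y in vars) {
  test <- summary(table(df[, "group"], df[, y]))
  output[y, ] <- c(test$statistic, test$p.value)
}
output
```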
2 issues: you are looping over y1 twice, and you are not appending your new result to the older one. I think you want to loop using lapply and rbind that list:
do.call(rbind,lapply(c("y1","y2"),
function (y) summary(table(df[,"group"],df[,y]))))[,c("statistic","p.value")]
statistic p.value
[1,] 1.065739 0.30191
[2,] 0.000106695 0.9917585
You basically do this twice:
y <- "y1"
test<-summary(table(df[,"group"],df[,y]))
myList <- list(cbind(test$statistic,test$p.value))
#[[1]]
# [,1] [,2]
#[1,] 1.065739 0.30191
See how there is only one element in the list? This element is passed to rbind:
do.call(rbind, myList)
# [,1] [,2]
#[1,] 1.065739 0.30191
rbind(myList[[1]])
# [,1] [,2]
#[1,] 1.065739 0.30191
Is there an easy way to convert a correlation-covariance matrix into a variance-covariance matrix? I always use nested for-loops as below, but I keep thinking there is probably a built-in function in base R.
my.matrix <- matrix(c(0.64901, 0.76519, -0.63620, -0.01923,
0.02114, 0.00118, -0.43198, 0.02480,
-0.21811, -0.00630, 0.18109, 0.05964,
-0.00710, 0.00039, 0.01162, 0.20972), nrow=4, byrow=TRUE)
new.matrix <- my.matrix
for(i in 1:nrow(my.matrix)) {
for(j in 1:ncol(my.matrix)) {
new.matrix[i,j] = ifelse(i<j, my.matrix[j,i], new.matrix[i,j])
}
}
new.matrix
# [,1] [,2] [,3] [,4]
# [1,] 0.64901 0.02114 -0.21811 -0.00710
# [2,] 0.02114 0.00118 -0.00630 0.00039
# [3,] -0.21811 -0.00630 0.18109 0.01162
# [4,] -0.00710 0.00039 0.01162 0.20972
I am aware of the lower.tri and upper.tri functions, but cannot seem to accomplish the task with a combination of them and t().
I think you might need to get the indices with which and then swap the rows and columns. Try this.
k <- which(lower.tri(my.matrix), arr.ind=TRUE)
my.matrix[k[,c(2,1)]] <- my.matrix[k]
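The same result can also be reached with upper.tri and t(), which the question mentions. This is a minimal sketch on the matrix from the question: the upper triangle is overwritten with the matching entries of the transpose.

```r
my.matrix <- matrix(c(0.64901, 0.76519, -0.63620, -0.01923,
                      0.02114, 0.00118, -0.43198, 0.02480,
                      -0.21811, -0.00630, 0.18109, 0.05964,
                      -0.00710, 0.00039, 0.01162, 0.20972),
                    nrow = 4, byrow = TRUE)

new.matrix <- my.matrix
up <- upper.tri(new.matrix)
# t(my.matrix)[i, j] equals my.matrix[j, i], so this copies the lower
# triangle into the mirrored upper positions
new.matrix[up] <- t(my.matrix)[up]
```

The diagonal and lower triangle are left untouched, so the result is symmetric.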
For example: I have a list of matrices, and I would like to evaluate their differences, sort of a 3-D diff. So if I have:
m1 <- matrix(1:4, ncol=2)
m2 <- matrix(5:8, ncol=2)
m3 <- matrix(9:12, ncol=2)
mat.list <- list(m1,m2,m3)
I want to obtain
mat.diff <- list(m2-m1, m3-m2)
The solution I found is the following:
mat.diff <- mapply(function (A,B) B-A, mat.list[-length(mat.list)], mat.list[-1])
Is there a nicer/built-in way to do this?
You can do this with just lapply or other ways of looping:
mat.diff <- lapply( tail( seq_along(mat.list), -1 ),
function(i) mat.list[[i]] - mat.list[[ i-1 ]] )
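Another base R option is Map, which pairs each matrix with its predecessor and always returns a list. This is a sketch on the matrices from the question.

```r
m1 <- matrix(1:4, ncol = 2)
m2 <- matrix(5:8, ncol = 2)
m3 <- matrix(9:12, ncol = 2)
mat.list <- list(m1, m2, m3)

# Subtract element i from element i + 1 for every consecutive pair
mat.diff <- Map(`-`, mat.list[-1], mat.list[-length(mat.list)])
```

Unlike mapply with its default simplification, Map keeps each difference as its own matrix.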
You can use combn to generate the index pairs of the matrices and apply a function to each combination.
l <- mat.list ## `l` is the list of matrices from the question
combn(seq_along(l),2,FUN=function(x)
if(diff(x) == 1) ## apply just for consecutive indices
l[[x[2]]]-l[[x[1]]],
simplify = FALSE) ## to get a list
Using @Arun's data, I get:
[[1]]
[,1] [,2]
[1,] 4 4
[2,] 4 4
[[2]]
NULL
[[3]]
[,1] [,2]
[1,] 4 4
[2,] 4 4