Is there a way to combine columns of a matrix as per the below:
Input:
m1
[,1] [,2] [,3] [,4]
S 121.0000000 100.0000000 100.0000000 82.6446281
P 0.5224135 0.1790449 0.1737533 0.1247883
Output:
m2
[,1] [,2] [,3]
S 121.0000000 100.0000000 82.6446281
P 0.5224135 0.3527982 0.1247883
I need to combine the elements of row 2 based on equal elements in row 1.
In this case, m2[2,2] = m1[2,2] + m1[2,3]
Thanks
In base R, using aggregate for example (m1 is the matrix from the question):
t(aggregate(P~S,t(m1),sum))
[,1] [,2] [,3]
S 82.6446281 100.0000000 121.0000000
P 0.1247883 0.3527982 0.5224135
Note the use of transpose here. aggregate groups rows by the values in a column, so in general it is easier to group by column than by row.
This would fully implement and demonstrate @jbaums's suggestion:
> rbind(sort( m1[1, !duplicated (m1[1,])] ), tapply(m1[2, ], m1[1, ], sum))
82.6446281 100 121
[1,] 82.6446281 100.0000000 121.0000000
[2,] 0.1247883 0.3527982 0.5224135
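If the original left-to-right order of the S values should be kept (as in the requested m2), one possible variation is sketched below. It rebuilds m1 from the numbers in the question; keys and grp are names I made up for the illustration. Note that it groups on exact equality of doubles, which is fine here but may need rounding for computed values.

```r
# Rebuild the question's matrix (values copied from the post).
m1 <- rbind(S = c(121, 100, 100, 82.6446281),
            P = c(0.5224135, 0.1790449, 0.1737533, 0.1247883))

# Distinct S values in order of first appearance.
keys <- unique(m1["S", ])

# Map every column to the position of its S value in `keys`.
grp <- match(m1["S", ], keys)

# Sum P within each group; groups come back in the order of `keys`.
sums <- tapply(m1["P", ], grp, sum)

m2 <- rbind(S = keys, P = as.vector(sums))
m2
```

Because grouping is done on the integer positions from match, the result does not get re-sorted by S the way the aggregate and tapply answers are.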
Let's assume I have a list of four data.frames containing some NA values:
my.list<-replicate(4,data.frame())
names(my.list)<-paste0("Frame.Number", c(1:4))
for (i in 1:4){
my.list[[i]]<-mapply(rnorm,10,c(1:4))
my.list[[i]][i+1,3]<-NA
my.list[[i]][c(i,i*2),4]<-NA
}
For each of the data.frames, I want to select those rows which don't contain NAs in the 4th column. I can, for example, create a list of vectors (?) containing information about the completeness of the cases in each data.frame:
selector <- lapply(my.list,"[",, 4)
selector <- lapply(selector,complete.cases)
Now this is where I am stuck: how do I apply the selector list to the my.list list in order to select only the complete cases? I thought I could use lapply again, but I cannot figure out some meaningful syntax.
We can lapply over the list, take the 4th column of each element, find the non-NA positions, and subset each matrix accordingly.
lapply(my.list,function(x) x[!is.na(x[,4]), ])
#$Frame.Number1
# [,1] [,2] [,3] [,4]
#[1,] 0.3668229 2.0688573 2.466580 4.339755
#[2,] -0.6391422 3.2635271 2.011809 3.296089
#[3,] 0.8662670 2.2797301 4.838563 4.443876
#[4,] -0.8972108 2.9305257 3.461650 5.525453
#[5,] -0.3452349 -0.2211153 2.570717 3.915671
#[6,] 0.6464616 2.3472838 4.009406 3.436188
#[7,] 0.9341354 2.3092428 2.338770 4.359324
#[8,] -0.5652311 3.2143472 1.944220 4.042566
#$Frame.Number2
# [,1] [,2] [,3] [,4]
#[1,] 0.22304364 2.6085569 3.459335 2.575920
#[2,] -0.08987518 2.9515099 NA 3.775579
#[3,] 2.03265254 0.9405609 3.266783 4.009509
....
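To answer the literal question (how to apply the selector list to my.list), one option is Map, which walks both lists in parallel. This is only a sketch: the data are rebuilt with lapply and a seed so it runs on its own, and filtered is a name I chose for the result.

```r
set.seed(1)  # seed added only so the sketch is reproducible

# Same construction as in the question: four 10 x 4 matrices with some NAs.
my.list <- lapply(1:4, function(i) {
  m <- mapply(rnorm, 10, 1:4)
  m[i + 1, 3] <- NA
  m[c(i, i * 2), 4] <- NA
  m
})
names(my.list) <- paste0("Frame.Number", 1:4)

# One logical vector per element: TRUE where column 4 is not NA.
selector <- lapply(my.list, function(x) complete.cases(x[, 4]))

# Pair each matrix with its own selector and keep the flagged rows.
# drop = FALSE keeps the result a matrix even if only one row survives.
filtered <- Map(function(x, keep) x[keep, , drop = FALSE], my.list, selector)

sapply(filtered, nrow)
```

Rows with an NA only in column 3 are kept, as in the lapply answer, because the selector looks at column 4 alone.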
I have two matrices. For example:
temp1 <- matrix(c(1,2,3,4,5,6),2,3,byrow = T)
temp2 <- matrix(c(7,8,9),1,3,byrow = T)
temp1
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
temp2
[,1] [,2] [,3]
[1,] 7 8 9
The two matrices have the same number of columns but different numbers of rows. I would like to add the single row of temp2 to every row of temp1, as follows. I wonder if there is a way to do this in R without for statements or apply functions.
temp <- do.call(rbind,lapply(1:2,function(x){temp1[x,]+temp2}))
temp
[,1] [,2] [,3]
[1,] 8 10 12
[2,] 11 13 15
This example is simple, but in practice I need to do the above with a 100 * 100 matrix and a 1 * 100 matrix. In this case, it takes too long, so I do not want to use for statements and apply functions.
You can use ?sweep:
temp1 <- matrix(c(1,2,3,4,5,6),2,3,byrow = T)
temp2 <- matrix(c(7,8,9),1,3,byrow = T)
sweep(temp1, 2, temp2, '+')
Unfortunately the help for sweep is really difficult to understand, but in this example you apply the function '+' with argument temp2 along the second dimension (the columns) of temp1, so temp2[j] is added to every element of column j.
For more examples, see: How to use the 'sweep' function
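As a sanity check, here is a small sketch comparing sweep with two other loop-free formulations. The variable names by_sweep, by_index and by_transpose are mine, and I have not timed any of these on a 100 * 100 matrix.

```r
temp1 <- matrix(c(1, 2, 3, 4, 5, 6), 2, 3, byrow = TRUE)
temp2 <- matrix(c(7, 8, 9), 1, 3, byrow = TRUE)

# 1. sweep: add temp2[j] to every element of column j.
by_sweep <- sweep(temp1, 2, temp2, "+")

# 2. Repeat the single row of temp2 so the shapes match, then add.
by_index <- temp1 + temp2[rep(1, nrow(temp1)), ]

# 3. Rely on column-major recycling: transpose, add the vector, transpose back.
by_transpose <- t(t(temp1) + as.vector(temp2))

by_sweep
```

All three give the matrix shown in the question (rows 8 10 12 and 11 13 15).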
I created a matrix in R
C<-matrix(c(0),nrow=6,ncol=6,byrow = FALSE)
Now I would like to replace the first column of the matrix with the value 1, the second and third columns with standard normal random variables, and the last three columns with the values of another matrix.
C<-matrix(c(0),nrow=6,ncol=6,byrow = FALSE)
other.matrix<-matrix(runif(18), nrow = 6, ncol = 3)
C[,1]<-1
C[,2:3]<-rnorm(12)
C[,4:6]<-other.matrix
To access the rows and columns of matrices (and for that matter, data.frames) in R you can use [] brackets and i,j notation, where i is the row and j is the column. For example, the 3rd row and 2nd column of your matrix C can be addressed with
C[3,2]
#[1] 0
Use <- to assign new values to the rows/columns you have selected.
For the first three columns, you can use
C<-matrix(c(0),nrow=6,ncol=6,byrow = FALSE)
C[ ,1] <- 1; C[ ,2] <- rnorm(6); C[ ,3] <- rnorm(6)
Let's now say your other matrix is called D and looks like
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 0.6527716 0.81793644 0.67138209 0.3175264 0.1067119 0.5907180 0.4619992
[2,] 0.2268516 0.90893913 0.62917211 0.1768426 0.3659889 0.0339911 0.2322981
[3,] 0.9264116 0.81693835 0.59555163 0.6960895 0.1667125 0.6631861 0.9718530
[4,] 0.2613363 0.06515864 0.04971742 0.7277188 0.2580444 0.3718222 0.8028141
[5,] 0.2526979 0.49294947 0.97502566 0.7962410 0.8321882 0.2981480 0.7098733
[6,] 0.4245959 0.95951112 0.45632856 0.8227812 0.3542232 0.2680804 0.7042317
Now let's say you want columns 3, 4, and 5 from D as the last three columns in C; then you can simply say
C[ ,4:6] <- D[ ,3:5]
And your result is
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 -1.76111875 0.4621061 0.67138209 0.3175264 0.1067119
[2,] 1 0.40036245 0.9054436 0.62917211 0.1768426 0.3659889
[3,] 1 -1.03238266 -0.6705829 0.59555163 0.6960895 0.1667125
[4,] 1 -0.47064774 0.3119684 0.04971742 0.7277188 0.2580444
[5,] 1 -0.01436411 -0.4688032 0.97502566 0.7962410 0.8321882
[6,] 1 -1.18711832 0.8227810 0.45632856 0.8227812 0.3542232
Just one thing to note is that this requires your number of rows to be the same between C and D.
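Putting the steps together, a self-contained sketch could look like this. D is filled with runif here purely for illustration (the real D would be your other matrix), and the seed is only there for reproducibility.

```r
set.seed(42)  # only so the sketch is reproducible

C <- matrix(0, nrow = 6, ncol = 6)
D <- matrix(runif(42), nrow = 6, ncol = 7)  # stand-in for the other matrix

# Guard against the row-count mismatch mentioned above.
stopifnot(nrow(C) == nrow(D))

C[, 1]   <- 1            # a single value is recycled down the column
C[, 2:3] <- rnorm(12)    # 12 draws fill columns 2 and 3 column by column
C[, 4:6] <- D[, 3:5]     # copy three columns from D

C
```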
I have a list of three data frames, each with n columns (6 in this case) and r rows (3 in this case). I want to create a matrix with the same dimensions (n*r) of the average of the three data frames within the list. So, for instance, [1,1] of the output matrix should be the average of 0.2470748, 0.2558439 and 0.2439057. Any ideas on how to do this?
We can use Reduce
Reduce(`+`, lis)/length(lis)
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 0.2489415 0.2825572 0.3033121 0.3011313 0.3560603 0.5091391
#[2,] 0.2033602 0.2516646 0.2805718 0.2855458 0.3428526 0.4959503
#[3,] 0.1841235 0.2362422 0.2771326 0.2821553 0.3382137 0.4888071
Or another option is apply (as mentioned in the comments by @Ananda Mahto)
apply(simplify2array(lis), c(1,2), mean)
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 0.2489415 0.2825572 0.3033121 0.3011313 0.3560603 0.5091391
#[2,] 0.2033602 0.2516646 0.2805718 0.2855458 0.3428526 0.4959503
#[3,] 0.1841235 0.2362422 0.2771326 0.2821553 0.3382137 0.4888071
The advantage of the apply approach with mean is that, if there are NA values, we can pass na.rm=TRUE as an argument.
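To illustrate the NA point, here is a small sketch with made-up data (the question's actual lis was not posted in usable form, so the numbers below are invented). I convert each data frame with as.matrix first, which is an extra step I added so the stacking into a 3-d array is unambiguous.

```r
# Toy stand-in for the question's list: three 2 x 2 data frames, one NA.
lis <- list(
  data.frame(a = c(1, 2),  b = c(3, 4)),
  data.frame(a = c(3, NA), b = c(5, 6)),
  data.frame(a = c(5, 6),  b = c(7, 8))
)
mats <- lapply(lis, as.matrix)

# Reduce: any NA in a cell makes that cell's average NA.
avg_reduce <- Reduce(`+`, mats) / length(mats)

# apply over a 2 x 2 x 3 array: NAs can be dropped cell by cell.
arr <- simplify2array(mats)
avg_apply <- apply(arr, c(1, 2), mean, na.rm = TRUE)

avg_reduce
avg_apply
```

In the cell with the missing value, the Reduce version returns NA while the apply version averages the two remaining numbers.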
I've got a list called res that looks like this:
[[1]]
[,1] [,2]
[1,] 275.0637 273.9386
[2,] 5.707791 5.755798
[[2]]
[,1] [,2]
[1,] 126.8435 59.08806
[2,] 4.867521 3.258545
[[3]]
[,1] [,2]
[1,] 23.50188 60.96321
[2,] 2.036354 3.737291
The list contains results from a simulation run a total of 6 times. I set a parameter of interest at three different values, '0' (i.e., [[1]]), '25' (i.e., [[2]]), and '50' (i.e., [[3]]). Since the model includes a great deal of randomness, I ran the model twice for each value (i.e., [,1], [,2]). I asked the model to record two results, 'time feeding' (i.e., [1,]) and 'distance traveled' (i.e., [2,]), for each iteration. Ultimately I will iterate the model 30 times for each variable setting. I'd like to use ggplot to create a boxplot showing 'time feeding' and 'distance traveled' for each of the three simulation settings (i.e., 0, 25, 50). I believe ggplot can't plot a list, so I tried to convert res to a data frame using res2 <- data.frame(res), which looked like:
X1 X2 X1.1 X2.1 X1.2 X2.2
1 275.0637 273.9386 126.8435 59.08806 23.50188 60.96321
2 5.707791 5.755798 4.867521 3.258545 2.036354 3.737291
This doesn't quite look right to me because now the results from all three simulations are on the same row. Any help on bringing this data into ggplot to create a boxplot would be really helpful. Thanks in advance!
--Neil
Assuming ll is your list, you can use do.call and rbind like this:
do.call(rbind,lapply(seq_along(ll),
function(x)data.frame(ll[[x]],iter=x)))
X..1. X..2. iter
[1,] 275.063700 273.938600 1
[2,] 5.707791 5.755798 1
[1,]1 126.843500 59.088060 2
[2,]1 4.867521 3.258545 2
[1,]2 23.501880 60.963210 3
[2,]2 2.036354 3.737291 3
EDIT after OP's clarification:
interest <- c(0,25,50)
do.call(rbind,lapply(seq_along(ll),
function(x)data.frame(x= unlist(ll[[x]]),interst=interest[x])))
x interst
X..1.1 275.063700 0
X..1.2 5.707791 0
X..2.1 273.938600 0
X..2.2 5.755798 0
X..1.11 126.843500 25
X..1.21 4.867521 25
X..2.11 59.088060 25
X..2.21 3.258545 25
X..1.12 23.501880 50
X..1.22 2.036354 50
X..2.12 60.963210 50
X..2.22 3.737291 50
EDIT: since the OP didn't provide data, here is how I built ll:
ll <- list(read.table(text='
[,1] [,2]
[1,] 275.0637 273.9386
[2,] 5.707791 5.755798'),
read.table(text='
[,1] [,2]
[1,] 126.8435 59.08806
[2,] 4.867521 3.258545'),
read.table(text='
[,1] [,2]
[1,] 23.50188 60.96321
[2,] 2.036354 3.737291'))
I would do
names(res) = c("0", "25", "50")
m = reshape2::melt(res, id = 1)
but it may not work: I only ran it in my head, because you didn't provide data in a usable form.
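For what it's worth, here is a base-R sketch of the long format that ggplot generally wants: one row per observation, with the setting and the measure as ordinary columns. It assumes res is a list of numeric matrices shaped as printed in the question; the column names setting, measure, run and value are my own choices.

```r
# Rebuild the question's list as plain numeric matrices (filled column by column).
res <- list(
  matrix(c(275.0637, 5.707791, 273.9386, 5.755798), nrow = 2),
  matrix(c(126.8435, 4.867521, 59.08806, 3.258545), nrow = 2),
  matrix(c(23.50188, 2.036354, 60.96321, 3.737291), nrow = 2)
)
settings <- c(0, 25, 50)                              # parameter value per list element
measures <- c("time feeding", "distance traveled")    # meaning of rows 1 and 2

long <- do.call(rbind, lapply(seq_along(res), function(i) {
  m <- res[[i]]
  data.frame(
    setting = settings[i],
    measure = rep(measures, times = ncol(m)),       # row label for each cell
    run     = rep(seq_len(ncol(m)), each = nrow(m)), # column (replicate) index
    value   = as.vector(m)                           # cells in column-major order
  )
}))
long

# With ggplot2 installed, something along these lines should then work:
# library(ggplot2)
# ggplot(long, aes(factor(setting), value)) +
#   geom_boxplot() +
#   facet_wrap(~ measure, scales = "free_y")
```

The same code should carry over unchanged once each matrix has 30 columns instead of 2, since the run and measure columns are derived from the matrix dimensions.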