I have a list of three data frames, each with n columns (6 in this case) and r rows (3 in this case). I want to create a matrix with the same dimensions (n*r) of the average of the three data frames within the list. So, for instance, [1,1] of the output matrix should be the average of 0.2470748, 0.2558439 and 0.2439057. Any ideas on how to do this?
We can use Reduce to add the list elements together element-wise and then divide by the number of elements:
Reduce(`+`, lis)/length(lis)
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 0.2489415 0.2825572 0.3033121 0.3011313 0.3560603 0.5091391
#[2,] 0.2033602 0.2516646 0.2805718 0.2855458 0.3428526 0.4959503
#[3,] 0.1841235 0.2362422 0.2771326 0.2821553 0.3382137 0.4888071
Another option is apply (as mentioned in the comments by @Ananda Mahto):
apply(simplify2array(lis), c(1,2), mean)
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 0.2489415 0.2825572 0.3033121 0.3011313 0.3560603 0.5091391
#[2,] 0.2033602 0.2516646 0.2805718 0.2855458 0.3428526 0.4959503
#[3,] 0.1841235 0.2362422 0.2771326 0.2821553 0.3382137 0.4888071
The advantage of the mean function is that, if there are NA values, we can pass na.rm=TRUE as an argument.
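To make the NA point concrete, here is a minimal sketch. It assumes the list elements are numeric matrices of equal dimensions; the `lis` built below is a hypothetical stand-in with random values, not the data from the question.

```r
# Hypothetical stand-in for `lis`: three 3 x 6 numeric matrices
set.seed(1)
lis <- replicate(3, matrix(runif(18), nrow = 3, ncol = 6), simplify = FALSE)
lis[[2]][1, 1] <- NA                      # introduce one missing value

# Reduce-based average: the NA propagates into cell [1, 1]
sum_avg <- Reduce(`+`, lis) / length(lis)

# Stack the matrices into a 3 x 6 x 3 array, then average over the 3rd dimension
arr <- simplify2array(lis)
avg <- apply(arr, c(1, 2), mean, na.rm = TRUE)

# rowMeans with dims = 2 computes the same averages without an explicit apply
avg2 <- rowMeans(arr, dims = 2, na.rm = TRUE)
```

With na.rm=TRUE, cell [1,1] is averaged over the two non-missing values only, whereas the Reduce version returns NA there.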
Let's assume I have a list of four data.frames containing some NA values:
my.list<-replicate(4,data.frame())
names(my.list)<-paste0("Frame.Number", c(1:4))
for (i in 1:4){
my.list[[i]]<-mapply(rnorm,10,c(1:4))
my.list[[i]][i+1,3]<-NA
my.list[[i]][c(i,i*2),4]<-NA
}
For each of the data.frames, I want to select the rows that don't contain NAs in the 4th column. I can, for example, create a list of vectors (?) containing information about the completeness of the cases in each data.frame:
selector <- lapply(my.list,"[",, 4)
selector <- lapply(selector,complete.cases)
Now this is where I am stuck: how do I apply the selector list to the my.list list in order to select only the complete cases? I thought I could use lapply again, but I cannot figure out some meaningful syntax.
We can lapply over the list, take the 4th column of each element, find the rows where it is not NA, and subset the matrix accordingly.
lapply(my.list,function(x) x[!is.na(x[,4]), ])
#$Frame.Number1
# [,1] [,2] [,3] [,4]
#[1,] 0.3668229 2.0688573 2.466580 4.339755
#[2,] -0.6391422 3.2635271 2.011809 3.296089
#[3,] 0.8662670 2.2797301 4.838563 4.443876
#[4,] -0.8972108 2.9305257 3.461650 5.525453
#[5,] -0.3452349 -0.2211153 2.570717 3.915671
#[6,] 0.6464616 2.3472838 4.009406 3.436188
#[7,] 0.9341354 2.3092428 2.338770 4.359324
#[8,] -0.5652311 3.2143472 1.944220 4.042566
#$Frame.Number2
# [,1] [,2] [,3] [,4]
#[1,] 0.22304364 2.6085569 3.459335 2.575920
#[2,] -0.08987518 2.9515099 NA 3.775579
#[3,] 2.03265254 0.9405609 3.266783 4.009509
....
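If you would rather reuse the selector list from the question, one possible approach is Map, which walks over both lists in parallel. The sketch below rebuilds the example data with lapply instead of the for loop (an assumption made only to keep it self-contained); `kept` is a hypothetical name.

```r
# Rebuild the example: four 10 x 4 matrices with some NAs
set.seed(42)
my.list <- lapply(1:4, function(i) {
  m <- mapply(rnorm, 10, 1:4)        # 10 x 4 matrix, column j has mean j
  m[i + 1, 3] <- NA
  m[c(i, i * 2), 4] <- NA
  m
})
names(my.list) <- paste0("Frame.Number", 1:4)

# One logical vector per matrix: TRUE where column 4 is not NA
selector <- lapply(my.list, function(x) complete.cases(x[, 4]))

# Apply each selector to its matrix; drop = FALSE keeps the result a matrix
kept <- Map(function(x, keep) x[keep, , drop = FALSE], my.list, selector)
```

Map pairs the i-th matrix with the i-th selector, so the two lists must be in the same order.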
I thought this would be easy, but it is not working for me. I am trying to follow this example to change the column names for each matrix in a list of matrices that I created:
Assign column names to list of dataframes
When I run the code below, I get a very weird return where it looks like I just set the name of each element in each matrix instead of just the column names.
#create a list of matrices containing random numbers
randoms<-lapply(1:1000, function(x) matrix(rnorm(1440), ncol=10))
trial<-lapply(randoms, setNames , nm = letters[1:10])
head(trial[[1]])
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0.89032453 1.02459736 0.7141343 -0.47405630 -2.0719943 -1.5087669
[2,] -0.74866047 0.44086093 -1.7540066 -2.04227094 -0.4875453 1.4207707
[3,] -0.04565454 -1.52336294 -0.1941370 -1.36252338 1.7338307 -1.3536725
[4,] 0.13242099 -0.09157545 -0.6156536 -1.34546174 -0.3279853 0.9663668
[5,] 2.09173141 0.41592339 0.7422889 -0.05991624 0.5319697 0.6413341
[6,] -0.32129540 2.11206231 0.1722047 -0.54404820 1.2685971 -0.0784607
[,7] [,8] [,9] [,10]
[1,] -0.4849624 -1.2590439 -1.5066718 -0.6758746
[2,] -2.5010320 -2.3469163 0.5221117 0.9186142
[3,] -1.3763468 -0.5551194 -0.2304872 -1.6087508
[4,] -2.0282231 -0.1949064 0.9329241 1.0196325
[5,] 1.6429999 1.8176161 -0.6549447 -1.8833887
[6,] 1.0044023 1.5895154 0.3660308 -0.1883819
head(attr(trial[[1]], "names"))
[1] "a" "b" "c" "d" "e" "f"
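setNames sets the names attribute of the whole object (one name per element), not the column names, which is why the matrices look unchanged. To set the column names, we can use a for loop with colnames: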
for(i in seq_along(randoms)) {
colnames(randoms[[i]]) <- letters[1:10]
}
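If you prefer to stay with lapply and leave `randoms` untouched, a possible sketch is to assign the column names inside an anonymous function and return the modified matrix. The list is shortened to 5 matrices here purely to keep the example small.

```r
# A smaller version of the list from the question
set.seed(1)
randoms <- lapply(1:5, function(x) matrix(rnorm(1440), ncol = 10))

# Set the column names on a copy of each matrix and return it
trial <- lapply(randoms, function(m) {
  colnames(m) <- letters[1:10]
  m
})

colnames(trial[[1]])
```

The replacement function can also be passed directly, as in lapply(randoms, `colnames<-`, letters[1:10]), which should give the same result.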
I created a matrix in R
C<-matrix(c(0),nrow=6,ncol=6,byrow = FALSE)
Now I would like to replace the first column of the matrix with the value 1, the second and third columns with standard normal random variables, and the last three columns with the values of another matrix.
C<-matrix(c(0),nrow=6,ncol=6,byrow = FALSE)
other.matrix<-matrix(runif(18), nrow = 6, ncol = 3)
C[,1]<-1
C[,3]<-rnorm(6)
C[,4:6]<-other.matrix
To access the rows and columns of matrices (and for that matter, data.frames) in R you can use [] brackets and i,j notation, where i is the row and j is the column. For example, the 3rd row and 2nd column of your matrix C can be addressed with
C[3,2]
#[1] 0
Use <- to assign new values to the rows/columns you have selected.
For the first three columns, you can use
C<-matrix(c(0),nrow=6,ncol=6,byrow = FALSE)
C[ ,1] <- 1; C[ ,2] <- rnorm(6); C[ ,3] <- rnorm(6)
Let's now say your other matrix is called D and looks like
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 0.6527716 0.81793644 0.67138209 0.3175264 0.1067119 0.5907180 0.4619992
[2,] 0.2268516 0.90893913 0.62917211 0.1768426 0.3659889 0.0339911 0.2322981
[3,] 0.9264116 0.81693835 0.59555163 0.6960895 0.1667125 0.6631861 0.9718530
[4,] 0.2613363 0.06515864 0.04971742 0.7277188 0.2580444 0.3718222 0.8028141
[5,] 0.2526979 0.49294947 0.97502566 0.7962410 0.8321882 0.2981480 0.7098733
[6,] 0.4245959 0.95951112 0.45632856 0.8227812 0.3542232 0.2680804 0.7042317
Now let's say you want columns 3, 4, and 5 from D as the last three columns in C; then you can simply say
C[ ,4:6] <- D[ ,3:5]
And your result is
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 -1.76111875 0.4621061 0.67138209 0.3175264 0.1067119
[2,] 1 0.40036245 0.9054436 0.62917211 0.1768426 0.3659889
[3,] 1 -1.03238266 -0.6705829 0.59555163 0.6960895 0.1667125
[4,] 1 -0.47064774 0.3119684 0.04971742 0.7277188 0.2580444
[5,] 1 -0.01436411 -0.4688032 0.97502566 0.7962410 0.8321882
[6,] 1 -1.18711832 0.8227810 0.45632856 0.8227812 0.3542232
One thing to note is that this requires C and D to have the same number of rows.
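Putting the steps together, a minimal sketch with the row check made explicit might look like this. The matrix D is a hypothetical stand-in filled with uniform random numbers, not the values printed above.

```r
set.seed(123)
C <- matrix(0, nrow = 6, ncol = 6)
D <- matrix(runif(42), nrow = 6, ncol = 7)   # hypothetical 6 x 7 source matrix

# Fail early if the row counts differ, since the block assignment needs them equal
stopifnot(nrow(C) == nrow(D))

C[, 1]   <- 1             # a single value is recycled down the column
C[, 2:3] <- rnorm(12)     # 12 draws fill columns 2 and 3 column by column
C[, 4:6] <- D[, 3:5]      # copy a 6 x 3 block from D
```

Filling columns 2 and 3 with a single rnorm(12) call is equivalent in distribution to two separate rnorm(6) calls.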
Is there a way to combine columns of a matrix as per the below:
Input:
m1
[,1] [,2] [,3] [,4]
S 121.0000000 100.0000000 100.0000000 82.6446281
P 0.5224135 0.1790449 0.1737533 0.1247883
Output:
m2
[,1] [,2] [,3]
S 121.0000000 100.0000000 82.6446281
P 0.5224135 0.3527982 0.1247883
I need to combine the elements of row 2 based on equal elements in row 1.
In this case, m2[2,2] = m1[2,2] + m1[2,3]
Thanks
In base R, using aggregate for example:
t(aggregate(P~S,t(m1),sum))
[,1] [,2] [,3]
S 82.6446281 100.0000000 121.0000000
P 0.1247883 0.3527982 0.5224135
Note the use of transpose here. In general, it is easier to group by column than by row.
This would fully implement and demonstrate @jbaums's suggestion:
> rbind(sort( m1[1, !duplicated (m1[1,])] ), tapply(m1[2, ], m1[1, ], sum))
82.6446281 100 121
[1,] 82.6446281 100.0000000 121.0000000
[2,] 0.1247883 0.3527982 0.5224135
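Both answers above return the groups in sorted order. If the original column order from the question (121, 100, 82.64) matters, one possible variation is to look the sums up by key in order of first appearance. This is a sketch under that assumption; `keys` and `sums` are hypothetical names.

```r
# The input matrix from the question
m1 <- rbind(S = c(121, 100, 100, 82.6446281),
            P = c(0.5224135, 0.1790449, 0.1737533, 0.1247883))

keys <- unique(m1["S", ])                      # distinct S values, first-seen order
sums <- tapply(m1["P", ], m1["S", ], sum)      # P summed per S, named by S value

# Reorder the sums to match `keys` and rebuild the 2-row matrix
m2 <- rbind(S = keys, P = unname(sums[as.character(keys)]))
m2
```

The lookup relies on as.character producing the same labels that tapply uses for its groups, which holds for these values but is worth checking for keys with many significant digits.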
I use R in terminal on a widescreen monitor. When I type print(matrix(rnorm(10*10),ncol=10)), it'll print 5 columns on top of each other, whereas I want it to print all 10 columns across the widescreen. Even when I have the terminal maximized, it does not take up the additional widescreen room for printing this matrix.
I am using Ubuntu and base R.
You can use this function:
wideScreen <- function(howWide=Sys.getenv("COLUMNS")) {
options(width=as.integer(howWide))
}
and then call it with the width you want, e.g. wideScreen(1000)
source: How to increase the number of columns using R in Linux
And here is some code to set it automatically on startup:
https://github.com/brendano/dlanalysis/blob/master/util.R
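One caveat with the function above is that COLUMNS is not always exported to R, in which case Sys.getenv returns an empty string and as.integer yields NA. A slightly more defensive sketch, with a hypothetical name `set_width` and an assumed fallback of 80, could be:

```r
# Set options(width) from a column count, falling back when it is missing or invalid
set_width <- function(cols = Sys.getenv("COLUMNS"), fallback = 80L) {
  w <- suppressWarnings(as.integer(cols))   # "" or non-numeric input becomes NA
  if (is.na(w) || w < 10L || w > 10000L) {  # options(width) accepts 10 to 10000
    w <- fallback
  }
  options(width = w)
  invisible(w)
}

set_width(130)   # explicit width, as in the examples in this thread
```

Calling set_width() with no argument uses COLUMNS when it is available and the fallback otherwise.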
See here:
options(width=111)
print(matrix(rnorm(10*10), ncol=10))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
...
[,10]
...
options(width=130)
print(matrix(rnorm(10*10), ncol=10))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
...