I created a matrix in R
C<-matrix(c(0),nrow=6,ncol=6,byrow = FALSE)
Now I would like to replace the first column of the matrix with the value 1, the second and third columns with standard normal random variables, and the last three columns with the values of another matrix.
C<-matrix(c(0),nrow=6,ncol=6,byrow = FALSE)
other.matrix<-matrix(runif(18), nrow = 6, ncol = 3)
C[,1]<-1
C[,2]<-rnorm(6)
C[,3]<-rnorm(6)
C[,4:6]<-other.matrix
To access the rows and columns of matrices (and for that matter, data.frames) in R you can use [] brackets and i,j notation, where i is the row and j is the column. For example, the 3rd row and 2nd column of your matrix C can be addressed with
C[3,2]
#[1] 0
Use <- to assign new values to the rows/columns you have selected.
For the first three columns, you can use
C<-matrix(c(0),nrow=6,ncol=6,byrow = FALSE)
C[ ,1] <- 1; C[ ,2] <- rnorm(6); C[ ,3] <- rnorm(6)
Let's now say your other matrix is called D and looks like
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 0.6527716 0.81793644 0.67138209 0.3175264 0.1067119 0.5907180 0.4619992
[2,] 0.2268516 0.90893913 0.62917211 0.1768426 0.3659889 0.0339911 0.2322981
[3,] 0.9264116 0.81693835 0.59555163 0.6960895 0.1667125 0.6631861 0.9718530
[4,] 0.2613363 0.06515864 0.04971742 0.7277188 0.2580444 0.3718222 0.8028141
[5,] 0.2526979 0.49294947 0.97502566 0.7962410 0.8321882 0.2981480 0.7098733
[6,] 0.4245959 0.95951112 0.45632856 0.8227812 0.3542232 0.2680804 0.7042317
Now let's say you want columns 3, 4, and 5 from D as the last three columns in C; then you can simply say
C[ ,4:6] <- D[ ,3:5]
And your result is
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 -1.76111875 0.4621061 0.67138209 0.3175264 0.1067119
[2,] 1 0.40036245 0.9054436 0.62917211 0.1768426 0.3659889
[3,] 1 -1.03238266 -0.6705829 0.59555163 0.6960895 0.1667125
[4,] 1 -0.47064774 0.3119684 0.04971742 0.7277188 0.2580444
[5,] 1 -0.01436411 -0.4688032 0.97502566 0.7962410 0.8321882
[6,] 1 -1.18711832 0.8227810 0.45632856 0.8227812 0.3542232
One thing to note: this requires C and D to have the same number of rows.
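Putting the steps above together, here is a minimal sketch. D is filled with runif() purely as a stand-in for your other matrix, and the seed and the stopifnot() check are additions for illustration, not part of the original answer.

```r
set.seed(42)                                 # only so the example is reproducible
C <- matrix(0, nrow = 6, ncol = 6)
D <- matrix(runif(42), nrow = 6, ncol = 7)   # stand-in for your other matrix

stopifnot(nrow(C) == nrow(D))                # row counts must match

C[, 1]   <- 1                                # a single value is recycled down the column
C[, 2:3] <- rnorm(12)                        # 12 draws fill the two columns column-wise
C[, 4:6] <- D[, 3:5]                         # copy a 6 x 3 block from D
C
```

If the row counts differ, stopifnot() stops with an error before any assignment is attempted.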
Let's assume I have a list of four data.frames containing some NA values:
my.list<-replicate(4,data.frame())
names(my.list)<-paste0("Frame.Number", c(1:4))
for (i in 1:4){
my.list[[i]]<-mapply(rnorm,10,c(1:4))
my.list[[i]][i+1,3]<-NA
my.list[[i]][c(i,i*2),4]<-NA
}
For each of the data.frames, I want to select those rows which don't contain NAs in the 4th column. I can, for example, create a list of vectors (?) containing information about the completeness of the cases in each data.frame:
selector <- lapply(my.list,"[",, 4)
selector <- lapply(selector,complete.cases)
Now this is where I am stuck: how do I apply the selector list to the my.list list in order to select only the complete cases? I thought I could use lapply again, but I cannot figure out some meaningful syntax.
We can lapply over the list, find the rows whose 4th column is not NA, and subset each matrix accordingly.
lapply(my.list,function(x) x[!is.na(x[,4]), ])
#$Frame.Number1
# [,1] [,2] [,3] [,4]
#[1,] 0.3668229 2.0688573 2.466580 4.339755
#[2,] -0.6391422 3.2635271 2.011809 3.296089
#[3,] 0.8662670 2.2797301 4.838563 4.443876
#[4,] -0.8972108 2.9305257 3.461650 5.525453
#[5,] -0.3452349 -0.2211153 2.570717 3.915671
#[6,] 0.6464616 2.3472838 4.009406 3.436188
#[7,] 0.9341354 2.3092428 2.338770 4.359324
#[8,] -0.5652311 3.2143472 1.944220 4.042566
#$Frame.Number2
# [,1] [,2] [,3] [,4]
#[1,] 0.22304364 2.6085569 3.459335 2.575920
#[2,] -0.08987518 2.9515099 NA 3.775579
#[3,] 2.03265254 0.9405609 3.266783 4.009509
....
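If you would rather reuse a selector list like the one in the question, one option is Map(), which walks over two lists in parallel. This is only a sketch: the data are rebuilt under a fixed seed so the block runs on its own, and the name kept is made up for illustration.

```r
set.seed(1)
my.list <- setNames(vector("list", 4), paste0("Frame.Number", 1:4))
for (i in 1:4) {
  my.list[[i]] <- mapply(rnorm, 10, 1:4)   # 10 x 4 matrix, column means 1..4
  my.list[[i]][i + 1, 3] <- NA
  my.list[[i]][c(i, i * 2), 4] <- NA
}

# one logical vector per list element: TRUE where column 4 is not NA
selector <- lapply(my.list, function(x) !is.na(x[, 4]))

# pair each matrix with its own selector and keep the flagged rows
kept <- Map(function(x, keep) x[keep, , drop = FALSE], my.list, selector)
sapply(kept, nrow)
```

drop = FALSE keeps each result a matrix even if only one row survives.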
I am using the pracma package, which contains the function nullspace(), returning normalized basis vectors of Null(A):
> require(pracma)
> (A = matrix(c(1,2,3,4,5,6), nrow=2, byrow=T))
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
> nullspace(A)
[,1]
[1,] 0.4082483
[2,] -0.8164966
[3,] 0.4082483
which is perfectly fine. However (don't ask), I want to quickly check the values I'd get if I were to produce the reduced row echelon form:
> rref(A)
[,1] [,2] [,3]
[1,] 1 0 -1
[2,] 0 1 2
and from there "manually" figure out the null space as
N(A) = [1, -2, 1]'
Yes, the latter is a scalar multiple of the former:
> c(1,-2,1)/nullspace(A)
[,1]
[1,] 2.44949
[2,] 2.44949
[3,] 2.44949
but I'd still like to get the latter, non-normalized form of a basis of the null space, as though the values were directly obtained from the reduced row echelon matrix.
You may want to try
B = rref(A)
solve(B[,1:2], -B[,3])
This gives the weights on the first two columns that cancel one unit of the third column. Append a 1 (the weight on the third column) to get your result, c(1, -2, 1).
The same idea applies when the dimension of the null space is larger than one.
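For the more general case, here is a rough sketch of reading a non-normalized basis straight off a matrix that is already in reduced row echelon form. The helper name null_basis_from_rref and its tol argument are made up for illustration, and B is typed in by hand from the rref(A) output above so that the block does not depend on pracma.

```r
# Assumes R is already in reduced row echelon form.
null_basis_from_rref <- function(R, tol = 1e-10) {
  nonzero_rows <- which(rowSums(abs(R) > tol) > 0)
  # pivot column = first non-zero entry of each non-zero row
  pivots <- vapply(nonzero_rows,
                   function(i) which(abs(R[i, ]) > tol)[1],
                   integer(1))
  free <- setdiff(seq_len(ncol(R)), pivots)
  N <- matrix(0, nrow = ncol(R), ncol = length(free))
  for (k in seq_along(free)) {
    N[free[k], k] <- 1                           # one unit of the free variable
    N[pivots, k]  <- -R[nonzero_rows, free[k]]   # pivot variables that cancel it
  }
  N
}

A <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, byrow = TRUE)
B <- matrix(c(1, 0, -1,
              0, 1,  2), nrow = 2, byrow = TRUE)  # rref(A) from above
N <- null_basis_from_rref(B)
N          # single column: 1, -2, 1
A %*% N    # all zeros
```

Each free column contributes one basis vector, so a null space of dimension two or more simply yields more columns in N.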
I thought this would be easy but it is not working for me. I am trying to follow this example to change the column names for each matrix in a list of matrices that I created:
Assign column names to list of dataframes
When I run the code below, I get a very weird return where it looks like I just set the name of each element in each matrix instead of just the column names.
#create a list of matrices containing random numbers
randoms<-lapply(1:1000, function(x) matrix(rnorm(1440), ncol=10))
trial<-lapply(randoms, setNames , nm = letters[1:10])
head(trial[[1]])
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0.89032453 1.02459736 0.7141343 -0.47405630 -2.0719943 -1.5087669
[2,] -0.74866047 0.44086093 -1.7540066 -2.04227094 -0.4875453 1.4207707
[3,] -0.04565454 -1.52336294 -0.1941370 -1.36252338 1.7338307 -1.3536725
[4,] 0.13242099 -0.09157545 -0.6156536 -1.34546174 -0.3279853 0.9663668
[5,] 2.09173141 0.41592339 0.7422889 -0.05991624 0.5319697 0.6413341
[6,] -0.32129540 2.11206231 0.1722047 -0.54404820 1.2685971 -0.0784607
[,7] [,8] [,9] [,10]
[1,] -0.4849624 -1.2590439 -1.5066718 -0.6758746
[2,] -2.5010320 -2.3469163 0.5221117 0.9186142
[3,] -1.3763468 -0.5551194 -0.2304872 -1.6087508
[4,] -2.0282231 -0.1949064 0.9329241 1.0196325
[5,] 1.6429999 1.8176161 -0.6549447 -1.8833887
[6,] 1.0044023 1.5895154 0.3660308 -0.1883819
head(attr(trial[[1]], "names"))
[1] "a" "b" "c" "d" "e" "f"
We can use a for loop
for(i in seq_along(randoms)) {
colnames(randoms[[i]]) <- letters[1:10]
}
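The reason setNames() behaves oddly here is that it sets the names attribute, which labels individual elements, whereas a matrix keeps its column labels in dimnames. If you prefer lapply to a loop, a sketch along these lines should work (a smaller list is used here so the example runs quickly):

```r
set.seed(1)
randoms <- lapply(1:3, function(x) matrix(rnorm(40), ncol = 10))

# set colnames on a copy of each matrix and return the modified copy
trial <- lapply(randoms, function(m) {
  colnames(m) <- letters[1:10]
  m
})

colnames(trial[[1]])
names(trial[[1]])      # NULL: no per-element names were set
```

Unlike the for loop, this leaves randoms untouched and returns a new list.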
I have a list of three data frames, each with n columns (6 in this case) and r rows (3 in this case). I want to create a matrix with the same dimensions (n*r) of the average of the three data frames within the list. So, for instance, [1,1] of the output matrix should be the average of 0.2470748, 0.2558439 and 0.2439057. Any ideas on how to do this?
We can use Reduce
Reduce(`+`, lis)/length(lis)
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 0.2489415 0.2825572 0.3033121 0.3011313 0.3560603 0.5091391
#[2,] 0.2033602 0.2516646 0.2805718 0.2855458 0.3428526 0.4959503
#[3,] 0.1841235 0.2362422 0.2771326 0.2821553 0.3382137 0.4888071
Or another option is apply (as mentioned in the comments by @Ananda Mahto)
apply(simplify2array(lis), c(1,2), mean)
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 0.2489415 0.2825572 0.3033121 0.3011313 0.3560603 0.5091391
#[2,] 0.2033602 0.2516646 0.2805718 0.2855458 0.3428526 0.4959503
#[3,] 0.1841235 0.2362422 0.2771326 0.2821553 0.3382137 0.4888071
The advantage of the mean function is that if there are NA values, we can pass na.rm=TRUE as an argument.
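To illustrate the NA point, here is a small sketch with made-up data. It assumes lis holds plain matrices; if yours are data frames, converting them first with lapply(lis, as.matrix) is probably the safest route.

```r
# three 2 x 3 matrices; the third has an NA in position [1, 1]
lis <- list(matrix(1:6, nrow = 2),
            matrix(7:12, nrow = 2),
            matrix(c(NA, 14:18), nrow = 2))

by.reduce <- Reduce(`+`, lis) / length(lis)
by.apply  <- apply(simplify2array(lis), c(1, 2), mean, na.rm = TRUE)

by.reduce[1, 1]   # NA: the missing value propagates through the sum
by.apply[1, 1]    # 4: mean of 1 and 7, with the NA dropped
```

Everywhere else the two results agree.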
I have a list of matrices with identical dimensions, for example:
mat.list=rep(list(matrix(rnorm(n=12,mean=1,sd=1), nrow = 3, ncol=4)),3)
I'm looking for an efficient way to retrieve a column from each matrix in the list where the column index of interest from each matrix is specified by a vector. For example, for this vector of column indices:
idx.vec=c(3,2,3)
I would like to obtain column 3 from matrix 1, column 2 from matrix 2, and column 3 from matrix 3, as a matrix so that this matrix dimensions are the number of rows of the matrices in the list by the number of matrices in the list.
For this example the result would therefore be:
cbind(mat.list[[1]][,3],mat.list[[2]][,2],mat.list[[3]][,3])
[,1] [,2] [,3]
[1,] 1.4852810 1.305448 1.4852810
[2,] 1.8647327 -1.237507 1.8647327
[3,] -0.0416013 2.156055 -0.0416013
One possible approach would be mapply('[', mat.list, TRUE, idx.vec). The trick is to use '[' for subsetting and TRUE as an argument to select all the rows. Here is how it works:
'['(matrix(1:4, ncol = 2), TRUE, 2)
# [1] 3 4
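Another (ugly) approach would be lapply(mat.list, "[",,idx.vec)[[1]]. Note that this takes the columns in idx.vec from the first matrix only; it matches the desired result here only because rep() makes all three matrices identical: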
> set.seed(1)
> mat.list=rep(list(matrix(rnorm(n=12,mean=1,sd=1), nrow = 3, ncol=4)),3)
> idx.vec=c(3,2,3)
> lapply(mat.list, "[",,idx.vec)[[1]]
[,1] [,2] [,3]
[1,] 1.487429 2.5952808 1.487429
[2,] 1.738325 1.3295078 1.738325
[3,] 1.575781 0.1795316 1.575781
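Because rep() produces three identical matrices, the example data cannot tell these two approaches apart. A quick check with matrices that actually differ (made-up integer data; the names res, expected and first.only are just for illustration):

```r
mat.list <- list(matrix(1:12,    nrow = 3),
                 matrix(101:112, nrow = 3),
                 matrix(201:212, nrow = 3))
idx.vec <- c(3, 2, 3)

# column idx.vec[k] from matrix k, one column of the result per matrix
res <- mapply(`[`, mat.list, TRUE, idx.vec)

expected <- cbind(mat.list[[1]][, 3], mat.list[[2]][, 2], mat.list[[3]][, 3])
all(res == expected)        # TRUE

# the lapply(...)[[1]] variant only ever looks at the first matrix
first.only <- lapply(mat.list, "[", , idx.vec)[[1]]
all(first.only == expected) # FALSE
```

So mapply is the one to use when the matrices in the list are not all the same.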