subsetting elements of a list by elements of another list - r

Let's assume I have a list of four data.frames containing some NA values:
my.list<-replicate(4,data.frame())
names(my.list)<-paste0("Frame.Number", c(1:4))
for (i in 1:4){
my.list[[i]]<-mapply(rnorm,10,c(1:4))
my.list[[i]][i+1,3]<-NA
my.list[[i]][c(i,i*2),4]<-NA
}
For each of the data.frames, I want to select the rows that don't contain NA in the 4th column. I can, for example, create a list of vectors (?) that records the completeness of the cases in each data.frame:
selector <- lapply(my.list,"[",, 4)
selector <- lapply(selector,complete.cases)
Now this is where I am stuck: how do I apply the selector list to the my.list list in order to select only the complete cases? I thought I could use lapply again, but I cannot figure out some meaningful syntax.

We can lapply over the list, test the 4th column of each element with is.na, and keep only the rows where that column is not NA. Note that mapply returns a matrix here, so each list element is subset as a matrix.
lapply(my.list,function(x) x[!is.na(x[,4]), ])
#$Frame.Number1
# [,1] [,2] [,3] [,4]
#[1,] 0.3668229 2.0688573 2.466580 4.339755
#[2,] -0.6391422 3.2635271 2.011809 3.296089
#[3,] 0.8662670 2.2797301 4.838563 4.443876
#[4,] -0.8972108 2.9305257 3.461650 5.525453
#[5,] -0.3452349 -0.2211153 2.570717 3.915671
#[6,] 0.6464616 2.3472838 4.009406 3.436188
#[7,] 0.9341354 2.3092428 2.338770 4.359324
#[8,] -0.5652311 3.2143472 1.944220 4.042566
#$Frame.Number2
# [,1] [,2] [,3] [,4]
#[1,] 0.22304364 2.6085569 3.459335 2.575920
#[2,] -0.08987518 2.9515099 NA 3.775579
#[3,] 2.03265254 0.9405609 3.266783 4.009509
....
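To reuse the selector list from the question instead of recomputing the index, one option is Map, which walks two lists in parallel. This is a minimal sketch: the example list is rebuilt with lapply rather than the question's for loop, and set.seed is an assumption added only to make the run reproducible.

```r
set.seed(1)  # assumption: fixed seed, only for reproducibility

# Rebuild the example: four 10 x 4 matrices with NAs in columns 3 and 4.
my.list <- lapply(1:4, function(i) {
  m <- mapply(rnorm, 10, 1:4)
  m[i + 1, 3] <- NA
  m[c(i, i * 2), 4] <- NA
  m
})
names(my.list) <- paste0("Frame.Number", 1:4)

# One logical vector per element: TRUE where column 4 is not NA.
selector <- lapply(my.list, function(x) complete.cases(x[, 4]))

# Apply each selector to its matching element; drop = FALSE keeps a matrix
# even if only one row survives.
filtered <- Map(function(x, keep) x[keep, , drop = FALSE], my.list, selector)

sapply(filtered, nrow)
```

Map keeps the names of my.list, so the result can still be addressed as filtered$Frame.Number1 and so on.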


Un-normalized null space basis of a matrix in R [duplicate]

I am using the pracma package, which contains the function nullspace(), returning normalized basis vectors of the null space of A:
> require(pracma)
> (A = matrix(c(1,2,3,4,5,6), nrow=2, byrow=T))
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
> nullspace(A)
[,1]
[1,] 0.4082483
[2,] -0.8164966
[3,] 0.4082483
which is perfectly fine. However (don't ask), I want to quickly check the values I'd get if I were to produce the reduced row echelon form:
> rref(A)
[,1] [,2] [,3]
[1,] 1 0 -1
[2,] 0 1 2
and from there "manually" figure out the null space as
N(A) = [1, -2, 1]'
Yes, the latter is a scalar multiple of the former:
> c(1,-2,1)/nullspace(A)
[,1]
[1,] 2.44949
[2,] 2.44949
[3,] 2.44949
but I'd still like to get the latter, non-normalized form of a basis of the null space, as though the values were directly obtained from the reduced row echelon matrix.
You may want to try
B = rref(A)
solve(B[,1:2], -B[,3])
This gives you the combination of the first two columns that cancels one unit of the third column, here c(1, -2). Append a 1 as the third component to get your result, c(1, -2, 1).
The same approach works when the dimension of the null space is larger than one: solve once for each free column.
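As a worked check of the steps above, here is a small sketch in base R. The matrix B is typed in by hand from the rref(A) output in the question, so pracma is not needed to run it. It assumes the pivots sit in the first two columns and that there is a single free column, which holds for this A but not in general.

```r
A <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, byrow = TRUE)

# Reduced row echelon form of A, copied from the rref(A) output above.
B <- matrix(c(1, 0, -1,
              0, 1,  2), nrow = 2, byrow = TRUE)

# Solve B[, 1:2] %*% x = -B[, 3] for the pivot variables.
x <- solve(B[, 1:2], -B[, 3])

# Append 1 for the free variable to obtain the un-normalized basis vector.
N <- c(x, 1)
N

# Check: A %*% N should be the zero vector.
A %*% N
```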

Test if a pattern is present in a list ; If Yes replace an element in another list

In R, I want to test whether a pattern is present in one list and, if so, replace the element at the same position in another list.
Let me take an example. My first list looks like this:
table 1:
     [,1] [,2]
[1,] ABBA BBCA
[2,] ABBU CCCH
My second list looks like this:
     [,1] [,2]
[1,] KIGS PLOM
[2,] ANAM AKAM
I want to test whether the pattern "KI" is present in an element of my second list and, if it is, replace the element at the same position in my first list.
In this case, "KI" is present in my second list in "KIGS", so I would replace "ABBA" with "KI", i.e. position [1,1] in both lists.
Is there a way to easily do that in R and obtain the following list:
     [,1] [,2]
[1,] KI   BBCA
[2,] ABBU CCCH
If I understand your question correctly, a possible solution is to use ifelse together with grepl. But do you have two matrices or two lists?
A list is a generic vector containing other objects, which is different from your example (see http://www.programcreek.com/2014/01/vector-array-list-and-data-frame-in-r/).
Here are two matrices as examples:
table1<-matrix(c("ABBA","BBCA","ABBU","CCCH"),nrow=2,ncol=2,byrow=TRUE)
table2<-matrix(c("KIGS","BBCA","ABBU","CCCH"),nrow=2,ncol=2,byrow=TRUE)
table1
[,1] [,2]
[1,] "ABBA" "BBCA"
[2,] "ABBU" "CCCH"
table2
[,1] [,2]
[1,] "KIGS" "BBCA"
[2,] "ABBU" "CCCH"
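I used a for loop with ifelse and grepl. Wherever an element of table2 contains "KI", it is replaced by the element of table1 at the same position:
for (j in seq_len(ncol(table2))) {
  for (i in seq_len(nrow(table2))) {
    # grepl already returns TRUE/FALSE, so no comparison with TRUE is needed
    table2[i, j] <- ifelse(grepl("KI", table2[i, j]), table1[i, j], table2[i, j])
  }
}
Then we have table2 as: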
table2
[,1] [,2]
[1,] "ABBA" "BBCA"
[2,] "ABBU" "CCCH"
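The loop is not strictly needed, because grepl is vectorized and a logical vector can index a matrix directly. The sketch below goes in the direction the question describes (writing "KI" into the first table where the second table contains it). It assumes the two tables are character matrices of the same shape with the values shown in the question, and that "KI" should be matched literally, hence fixed = TRUE.

```r
table1 <- matrix(c("ABBA", "BBCA", "ABBU", "CCCH"), nrow = 2, byrow = TRUE)
table2 <- matrix(c("KIGS", "PLOM", "ANAM", "AKAM"), nrow = 2, byrow = TRUE)

# One TRUE/FALSE per element of table2, in column-major order.
hit <- grepl("KI", table2, fixed = TRUE)

# Logical indexing replaces the elements of table1 at the matching positions.
table1[hit] <- "KI"
table1
```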

R: change column names of every matrix in a list

I thought this would be easy, but it is not working for me. I am trying to follow this example to change the column names of each matrix in a list of matrices that I created:
Assign column names to list of dataframes
When I run the code below, I get a very weird return where it looks like I just set the name of each element in each matrix instead of just the column names.
#create a list of matrices containing random numbers
randoms<-lapply(1:1000, function(x) matrix(rnorm(1440), ncol=10))
trial<-lapply(randoms, setNames , nm = letters[1:10])
head(trial[[1]])
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0.89032453 1.02459736 0.7141343 -0.47405630 -2.0719943 -1.5087669
[2,] -0.74866047 0.44086093 -1.7540066 -2.04227094 -0.4875453 1.4207707
[3,] -0.04565454 -1.52336294 -0.1941370 -1.36252338 1.7338307 -1.3536725
[4,] 0.13242099 -0.09157545 -0.6156536 -1.34546174 -0.3279853 0.9663668
[5,] 2.09173141 0.41592339 0.7422889 -0.05991624 0.5319697 0.6413341
[6,] -0.32129540 2.11206231 0.1722047 -0.54404820 1.2685971 -0.0784607
[,7] [,8] [,9] [,10]
[1,] -0.4849624 -1.2590439 -1.5066718 -0.6758746
[2,] -2.5010320 -2.3469163 0.5221117 0.9186142
[3,] -1.3763468 -0.5551194 -0.2304872 -1.6087508
[4,] -2.0282231 -0.1949064 0.9329241 1.0196325
[5,] 1.6429999 1.8176161 -0.6549447 -1.8833887
[6,] 1.0044023 1.5895154 0.3660308 -0.1883819
head(attr(trial[[1]], "names"))
[1] "a" "b" "c" "d" "e" "f"
We can use a for loop and assign the colnames of each matrix:
for(i in seq_along(randoms)) {
colnames(randoms[[i]]) <- letters[1:10]
}
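The setNames call in the question sets the "names" attribute, which on a matrix labels individual elements rather than columns. That matches the attr(trial[[1]], "names") output shown above. If an lapply version is preferred over the loop, a minimal sketch is below. It uses a shorter list of 3 matrices instead of 1000, purely to keep the example small.

```r
# Smaller stand-in for the list in the question (3 matrices instead of 1000).
randoms <- lapply(1:3, function(x) matrix(rnorm(1440), ncol = 10))

# Set the column names inside the function and return the modified matrix.
trial <- lapply(randoms, function(m) {
  colnames(m) <- letters[1:10]
  m
})

colnames(trial[[1]])
```

Unlike the for loop, this leaves randoms untouched and returns a new list.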

Replacing matrix columns in R

I created a matrix in R
C<-matrix(c(0),nrow=6,ncol=6,byrow = FALSE)
Now I would like to replace the first column of the matrix with the value 1, the second and third columns with standard normal random variables, and the last three columns with the values of another matrix.
C<-matrix(c(0),nrow=6,ncol=6,byrow = FALSE)
other.matrix<-matrix(runif(18), nrow = 6, ncol = 3)
C[,1]<-1
C[,2]<-rnorm(6)
C[,3]<-rnorm(6)
C[,4:6]<-other.matrix
To access the rows and columns of matrices (and for that matter, data.frames) in R you can use [] brackets and i,j notation, where i is the row and j is the column. For example, the 3rd row and 2nd column of your matrix C can be addressed with
C[3,2]
#[1] 0
Use <- to assign new values to the rows/columns you have selected.
For the first three columns, you can use
C<-matrix(c(0),nrow=6,ncol=6,byrow = FALSE)
C[ ,1] <- 1; C[ ,2] <- rnorm(6); C[ ,3] <- rnorm(6)
Let's now say your other matrix is called D and looks like
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 0.6527716 0.81793644 0.67138209 0.3175264 0.1067119 0.5907180 0.4619992
[2,] 0.2268516 0.90893913 0.62917211 0.1768426 0.3659889 0.0339911 0.2322981
[3,] 0.9264116 0.81693835 0.59555163 0.6960895 0.1667125 0.6631861 0.9718530
[4,] 0.2613363 0.06515864 0.04971742 0.7277188 0.2580444 0.3718222 0.8028141
[5,] 0.2526979 0.49294947 0.97502566 0.7962410 0.8321882 0.2981480 0.7098733
[6,] 0.4245959 0.95951112 0.45632856 0.8227812 0.3542232 0.2680804 0.7042317
Now let's say you want columns 3, 4, and 5 from D as the last three columns of C. Then you can simply say
C[ ,4:6] <- D[ ,3:5]
And your result is
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 -1.76111875 0.4621061 0.67138209 0.3175264 0.1067119
[2,] 1 0.40036245 0.9054436 0.62917211 0.1768426 0.3659889
[3,] 1 -1.03238266 -0.6705829 0.59555163 0.6960895 0.1667125
[4,] 1 -0.47064774 0.3119684 0.04971742 0.7277188 0.2580444
[5,] 1 -0.01436411 -0.4688032 0.97502566 0.7962410 0.8321882
[6,] 1 -1.18711832 0.8227810 0.45632856 0.8227812 0.3542232
Just one thing to note is that this requires your number of rows to be the same between C and D.
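Putting the steps together, here is a self-contained sketch. The matrix D is a hypothetical stand-in generated with runif, since the values printed above cannot be recovered exactly. The set.seed call and the stopifnot guard on the row counts are additions for illustration.

```r
set.seed(42)  # assumption: fixed seed, only for reproducibility

C <- matrix(0, nrow = 6, ncol = 6)
D <- matrix(runif(42), nrow = 6, ncol = 7)  # hypothetical stand-in for D

# Fail early if the row counts differ, since the assignment below needs them equal.
stopifnot(nrow(C) == nrow(D))

C[, 1] <- 1            # constant column
C[, 2:3] <- rnorm(12)  # two columns of standard normal draws
C[, 4:6] <- D[, 3:5]   # columns 3 to 5 of D
C
```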

Taking the average

I have a list of three data frames, each with n columns (6 in this case) and r rows (3 in this case). I want to create a matrix with the same dimensions (n*r) of the average of the three data frames within the list. So, for instance, [1,1] of the output matrix should be the average of 0.2470748, 0.2558439 and 0.2439057. Any ideas on how to do this?
We can use Reduce
Reduce(`+`, lis)/length(lis)
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 0.2489415 0.2825572 0.3033121 0.3011313 0.3560603 0.5091391
#[2,] 0.2033602 0.2516646 0.2805718 0.2855458 0.3428526 0.4959503
#[3,] 0.1841235 0.2362422 0.2771326 0.2821553 0.3382137 0.4888071
Or another option is apply (as mentioned in the comments by @Ananda Mahto)
apply(simplify2array(lis), c(1,2), mean)
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 0.2489415 0.2825572 0.3033121 0.3011313 0.3560603 0.5091391
#[2,] 0.2033602 0.2516646 0.2805718 0.2855458 0.3428526 0.4959503
#[3,] 0.1841235 0.2362422 0.2771326 0.2821553 0.3382137 0.4888071
The advantage of the mean function is that, if there are NA values, we can pass na.rm=TRUE as an argument.
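To illustrate the difference when NA values are present, here is a small sketch with a made-up list of three 2 x 2 matrices. The data are hypothetical and are not the data frames from the question.

```r
# Hypothetical data: three 2 x 2 matrices, one of them containing an NA.
lis <- list(
  matrix(c(1, 2, 3, 4), nrow = 2),
  matrix(c(3, NA, 5, 6), nrow = 2),
  matrix(c(5, 6, 7, 8), nrow = 2)
)

# Reduce propagates the NA into the result at position [2, 1].
Reduce(`+`, lis) / length(lis)

# apply with mean can skip it, averaging the remaining values in that cell.
avg <- apply(simplify2array(lis), c(1, 2), mean, na.rm = TRUE)
avg
```

With na.rm = TRUE the cell at [2, 1] is the mean of the two non-missing values, so cells can end up averaged over different numbers of observations.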
