R: Apply family that deletes columns as part of the function - r

I am trying to iterate through each row in a matrix, find the column with the minimum value and the column name and then delete that column after it has been used so that a new minimum can be calculated. The correct answer should look like this:
result
1/1 50
2/2 61
3/3 72
4/4 83
Test_Matrix <- matrix(c(50:149), ncol = 10 , byrow=FALSE)
Names <- c(1:10)
colnames(Test_Matrix) <- Names
rownames(Test_Matrix) <- Names
result <- t(sapply(seq(nrow(Test_Matrix)), function(i) {
j <- which.min(Test_Matrix[i,])
c(paste(rownames(Test_Matrix)[i], colnames(Test_Matrix)[j], sep='/'), Test_Matrix[i,j])
drops <- colnames(Test_Matrix)[j]
Test_Matrix[ , !(names(Test_Matrix) %in% drops)]
}))
result
Second question is that I would like to choose the order of the rows during the iteration so that it chooses to go to the next row that had the same name as the column name. For example, if the column with the minimum was named 5, column 5 would be deleted and the minimum for the row named 5 would be calculated next.
Wondering if this is possible and if a loop is needed for these calculations.
As a new R user, I appreciate any help.
Thanks!

For the first part of your question:
Test_Matrix <- matrix(c(50:149), ncol = 10 , byrow=FALSE)
Names <- c(1:10)
colnames(Test_Matrix) <- Names
rownames(Test_Matrix) <- Names
result <- matrix(nrow=0, ncol=2)
for (i in 1:nrow(Test_Matrix)) {
  j <- which.min(Test_Matrix[i, ])
  result <- rbind(result, c(paste(rownames(Test_Matrix)[i],
                                  colnames(Test_Matrix)[j], sep='/'),
                            as.numeric(Test_Matrix[i,j])))
  # remove column j; drop = FALSE keeps a matrix (and its column names)
  # even when only one column is left, instead of collapsing to a vector
  Test_Matrix <- Test_Matrix[, -j, drop = FALSE]
}
result
##       [,1]    [,2]
##  [1,] "1/1"   "50"
##  [2,] "2/2"   "61"
##  [3,] "3/3"   "72"
##  [4,] "4/4"   "83"
##  [5,] "5/5"   "94"
##  [6,] "6/6"   "105"
##  [7,] "7/7"   "116"
##  [8,] "8/8"   "127"
##  [9,] "9/9"   "138"
## [10,] "10/10" "149"
Edit: For the second part, you can replace the for loop with this while loop (start again from a fresh Test_Matrix and an empty result):
i <- 1
while(length(Test_Matrix)>0) {
  j <- which.min(Test_Matrix[i, ])
  nm <- colnames(Test_Matrix)[j]   # name of the column about to be removed
  result <- rbind(result, c(paste(rownames(Test_Matrix)[i], nm, sep='/'),
                            as.numeric(Test_Matrix[i,j])))
  Test_Matrix <- Test_Matrix[, -j, drop = FALSE]   # stay a matrix, keep column names
  i <- as.numeric(nm)+1
}
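If the goal is literally to jump to the row whose name matches the column that was just removed (as the second part of the question describes), one possible sketch is below. The helper name greedy_min_walk and the small 3x3 matrix are made up for illustration, and the sketch assumes every column name also exists as a row name.

```r
# Hypothetical helper (not from the answer above): at each step take the
# minimum of the current row, record it, delete that column, then move to
# the row that has the same name as the deleted column.
greedy_min_walk <- function(m, start = rownames(m)[1]) {
  n <- ncol(m)
  pair <- character(n)
  value <- numeric(n)
  row_name <- start
  for (k in seq_len(n)) {
    j <- which.min(m[row_name, ])
    col_name <- colnames(m)[j]
    pair[k] <- paste(row_name, col_name, sep = "/")
    value[k] <- m[row_name, j]
    m <- m[, -j, drop = FALSE]   # keep a matrix even with one column left
    row_name <- col_name         # next row = name of the removed column
  }
  data.frame(pair = pair, value = value, stringsAsFactors = FALSE)
}

# Illustrative data (made up)
m <- matrix(c(5, 1, 9,
              2, 8, 3,
              7, 4, 6),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("1", "2", "3"), c("1", "2", "3")))
walk <- greedy_min_walk(m)
walk
```

Returning a data frame keeps the value column numeric, whereas rbind-ing onto a character matrix turns every value into a string.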

Related

Replacing pair of element of symmetric matrix with NA

I have a positive definite symmetric matrix. Pasting the matrix generated using the following code:
library(clusterGeneration)   # provides genPositiveDefMat()
set.seed(123)
m <- genPositiveDefMat(
  dim = 3,
  covMethod = "unifcorrmat",
  rangeVar = c(0,1) )
x <- as.matrix(m$Sigma)
diag(x) <- 1
x
#Output
           [,1]       [,2]       [,3]
[1,]  1.0000000 -0.2432303 -0.4110525
[2,] -0.2432303  1.0000000 -0.1046602
[3,] -0.4110525 -0.1046602  1.0000000
Now, I want to run the matrix through iterations and in each iteration I want to replace the symmetric pair with NA. For example,
Iteration 1:
x[1,2] = x[2,1] <- NA
Iteration2:
x[1,3] = x[3,1] <- NA
and so on....
My idea was to check using a for loop
Prototype:
for( r in 1:nrow(x)
for( c in 1:ncol(x)
if x[r,c]=x[c,r]<-NA
else
x[r,c]
The issue with my code is for row 1 and column 1, the values are equal hence it sets to 0 (which is wrong). Also, the moment it is not NA it comes out of the loop.
Appreciate any help here.
Thanks
If you need the replacement done iteratively, you can use the indexes given by lower.tri(x) (or upper.tri(x)) to do the replacements pair by pair. That lets you pass the result to a function before or after each replacement, e.g. (mat here is the matrix x from the question):
mat <- x
idx <- which(lower.tri(mat), arr.ind=TRUE)
sel <- cbind(
replace(mat, , seq_along(mat))[ idx ],
replace(mat, , seq_along(mat))[ idx[,2:1] ]
)
# [,1] [,2]
#[1,] 2 4 ##each row represents the lower/upper pair
#[2,] 3 7
#[3,] 6 8
for( i in seq_len(nrow(sel)) ) {
mat[ sel[i,] ] <- NA
print(mean(mat, na.rm=TRUE))
}
#[1] 0.2812249
#[1] 0.5581359
#[1] 1
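The same pair-by-pair idea can also be sketched with row/column indices instead of linear indices, using combn() to enumerate the pairs. The matrix values here are made up for illustration, and mean() is only a stand-in for whatever needs to run after each replacement.

```r
# Illustrative symmetric matrix with unit diagonal (values are made up)
x <- matrix(c( 1.0, -0.2, -0.4,
              -0.2,  1.0, -0.1,
              -0.4, -0.1,  1.0), nrow = 3, byrow = TRUE)

pairs <- t(combn(nrow(x), 2))        # rows are (1,2), (1,3), (2,3)
means <- numeric(nrow(pairs))
for (k in seq_len(nrow(pairs))) {
  r <- pairs[k, 1]
  s <- pairs[k, 2]
  x[r, s] <- x[s, r] <- NA           # blank out both halves of the pair
  means[k] <- mean(x, na.rm = TRUE)  # any per-iteration computation goes here
}
means
```

Because the pairs only ever have r < s, the diagonal is never touched, which avoids the problem described in the question for row 1 and column 1.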

Run a for loop over a list, save each list object in output matrix

I want to run a for loop over a list. I have 3 questions:
I wonder how I select each list in "i in 1:list", when doing the same for a data frame it can be for(i in 1:ncol(df)), but how should this be written for a list?
I also wonder how I could "add on" the values for each loop to the " output" matrix?
How to convert a list to a data frame. When doing it like data.frame(df3) they will be added column wise, number of rows will always be 3.
Many thanks if someone has suggestions!
name <- rep("gg",3)
id <- LETTERS[1:3]
emmeans <- runif(1:3)
SE <- runif(1:3)
p <- rep(c(0.001),3)
df <- data.frame(name,id, emmeans,p)
df
df <- list(df)
name <- rep("ff",3)
id <- LETTERS[1:3]
emmeans <- runif(1:3)
SE <- runif(1:3)
p <- rep(c(0.003),3)
df2 <- data.frame(name,id, emmeans,p)
df2 <- list(df2)
df3 <- list(df,df2)
df3
> df3
[[1]]
[[1]][[1]]
name id emmeans p
1 gg A 0.2248491 0.001
2 gg B 0.4213938 0.001
3 gg C 0.3671521 0.001
[[2]]
[[2]][[1]]
name id emmeans p
1 ff A 0.2561801 0.003
2 ff B 0.1705811 0.003
3 ff C 0.9714178 0.003
output <- matrix(nrow=2, ncol=5)
for(i in 1:list){ # what could be written here?
d <- as.data.frame(df3[[i]])
for(j in 1:nrow(d)){
output[i,1] <- d[1,3]
output[i,2] <- d[2,3]
output[i,3] <- d[3,3]
output[i,4] <- d[1,4]
output[i,5] <- d[1,1]
}
}
#Wanted outcome:
output
[,1] [,2] [,3] [,4] [,5]
[1,] "0.224849119782448" "0.421393777942285" "0.367152112303302" "0.001" "gg"
[2,] "0.256180095253512" "0.170581063022837" "0.971417842432857" "0.003" "ff"
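This data is weirdly structured enough, and the extraction is unique enough, that I think a for loop is the easiest way. Note that each element of df3 is itself a one-element list wrapping the data frame, so you need df3[[i]][[1]] to reach it; with() on df3[[i]] alone would silently pick up the global emmeans, p and name instead. Something like this:
output <- matrix(nrow = length(df3), ncol=5)
for(i in seq_along(df3)) {
  output[i, ] = with(df3[[i]][[1]], c(emmeans, p[1], as.character(name[1])))
}
output
# with the data printed in the question, this gives the wanted outcome:
#      [,1]                [,2]                [,3]                [,4]    [,5]
# [1,] "0.224849119782448" "0.421393777942285" "0.367152112303302" "0.001" "gg"
# [2,] "0.256180095253512" "0.170581063022837" "0.971417842432857" "0.003" "ff"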
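For the third part of the question (turning the list into a data frame), a minimal sketch is below. It uses fixed, made-up emmeans values instead of runif() so it is reproducible, and the names nested, frames, long and wide are only illustrative.

```r
# Two small data frames, each wrapped in its own list as in the question
df_a <- data.frame(name = rep("gg", 3), id = LETTERS[1:3],
                   emmeans = c(0.22, 0.42, 0.37), p = 0.001,
                   stringsAsFactors = FALSE)
df_b <- data.frame(name = rep("ff", 3), id = LETTERS[1:3],
                   emmeans = c(0.26, 0.17, 0.97), p = 0.003,
                   stringsAsFactors = FALSE)
nested <- list(list(df_a), list(df_b))

# Unwrap the inner one-element lists
frames <- lapply(nested, function(el) el[[1]])

# Long format: stack the data frames row-wise (6 rows, 4 columns)
long <- do.call(rbind, frames)

# Wide format: one row per data frame, one column per id, plus p and name
wide <- do.call(rbind, lapply(frames, function(d) {
  data.frame(t(setNames(d$emmeans, d$id)),
             p = d$p[1], name = d$name[1],
             stringsAsFactors = FALSE)
}))
wide
```

Unlike a character matrix, the wide data frame keeps the emmeans and p columns numeric.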

How to iterate over a matrix using a function in R?

I have created a function to order a vector of length 2, using the following code
x = (c(6,2))
orders = function(x){
for(i in 1:(length(x)-1)){
if(x[i+1] < x[i]){
return(c(x[i+1], x[i]))} else{
(return(x))
}}}
orders(x)
I have been asked to use this function to process a dataset with 2 columns as follows. Iterate over the rows of the
data set, and if the element in the 2nd column of row i is less than the element in the first
column of row i, switch the order of the two entries in the row by making a suitable call to
the function you just wrote.
I've tried using the following code
set.seed(1128719)
data=matrix(rnorm(20),byrow=T,ncol=2)
df = for (i in 1:2) {
for(j in 1:10){
data = orders(c(x[i], x[j]))
return(data)
}
}
The output is null. I'm not quite sure where I'm going wrong.
Any suggestions?
I modified your code a bit but tried to keep the 'style' the same.
There is no need for a loop inside the function: for a vector of length 2, 1:(length(x)-1) always evaluates to 1:1, so i only ever takes the value 1.
orders = function(x){
  # Since the function only works on vectors of length 2,
  # it's good practice to raise an error right at the start
  if (length(x) != 2) {
    stop("x must be a vector of length 2")
  }
  if (x[2] < x[1]) {
    return(c(x[2], x[1]))
  } else {
    return(x)
  }
}
orders(c(6, 2))
set.seed(1128719)
data <- matrix(rnorm(20),byrow=T,ncol=2)
The for loop itself can't be assigned to a variable, but we can use the loop to modify the matrix 'data' in place:
for (row in 1:nrow(data)) {
data[row, ] <- orders(data[row,])
}
data
Edit:
This is the input:
[,1] [,2]
[1,] -0.04142965 0.2377140
[2,] -0.76237866 -0.8004284
[3,] 0.18700893 -0.6800310
[4,] 0.76499646 0.4430643
[5,] 0.09193440 -0.2592316
[6,] 1.17478053 -0.4044760
[7,] -1.62262500 0.1652850
[8,] -1.54848857 0.7475451
[9,] -0.05907252 -0.8324074
[10,] -1.11064318 -0.1148806
This is the output I get:
[,1] [,2]
[1,] -0.04142965 0.23771403
[2,] -0.80042842 -0.76237866
[3,] -0.68003104 0.18700893
[4,] 0.44306433 0.76499646
[5,] -0.25923164 0.09193440
[6,] -0.40447603 1.17478053
[7,] -1.62262500 0.16528496
[8,] -1.54848857 0.74754509
[9,] -0.83240742 -0.05907252
[10,] -1.11064318 -0.11488062
Here are two ways of ordering the elements within each row of a 2-column matrix.
This is the test matrix posted in the question.
set.seed(1128719)
data <- matrix(rnorm(20), byrow = TRUE, ncol = 2)
1. With a function orders.
The function expects as input a 2 element vector. If they are out of order, return the vector with its elements reversed, else return the vector as is.
orders <- function(x){
stopifnot(length(x) == 2)
if(x[2] < x[1]){
x[2:1]
}else{
x
}
}
Test the function.
x <- c(6,2)
orders(x)
#[1] 2 6
Now with the matrix data.
df1 <- t(apply(data, 1, orders))
2. Vectorized code.
Create a logical index that is TRUE wherever the two elements of a row are out of order, and reverse only those rows.
df2 <- data
inx <- data[,2] < data[,1]
df2[inx, ] <- data[inx, 2:1]
The results are the same.
identical(df1, df2)
#[1] TRUE
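A third vectorized option, sketched here as an alternative rather than as part of either answer, is to take the element-wise minimum and maximum of the two columns with pmin() and pmax(). The name sorted_rows is only illustrative.

```r
set.seed(1128719)
data <- matrix(rnorm(20), byrow = TRUE, ncol = 2)

# Smaller value of each row goes in column 1, larger value in column 2
sorted_rows <- cbind(pmin(data[, 1], data[, 2]),
                     pmax(data[, 1], data[, 2]))
sorted_rows
```

This does not call orders(), so it does not satisfy the exercise as stated, but it is a quick way to check the result of the loop.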

calculate every pairwise quotient of a list of vectors in a dataframe, store as a new object in R

Say I have 4 vectors, in data frame datt:
a<-rnorm(10,30)
b<-rnorm(10,20)
c<-rnorm(10,40)
d<-rnorm(10,100)
datt<- data.frame(a,b,c,d)
I then have a list of vectors within the dataframe:
list<-c(a,b,c)
I'd like to calculate every pairwise quotient of vectors within the dataframe that are included in my list. so a/b, a/c, and b/c, and store those vectors as a new dataframe or array. Bonus if the names of the columns in the new array are a.b,a.c,b.c.
I guess this works:
# alternate naming
L <- list(a=a,b=b,c=c)
# or even easier
L <- datt[c("a","b","c")]
res <- combn(L, 2, function(x) x[[1]]/x[[2]])
To assign names...
colnames(res) <- apply(combn(names(L),2),2,paste,collapse=".")
With set.seed(1) before drawing sample data, this gives:
a.b a.c b.c
[1,] 1.365463 0.7178465 0.5257165
[2,] 1.480327 0.7401192 0.4999700
[3,] 1.504966 0.7277527 0.4835676
[4,] 1.776483 0.8312218 0.4679031
[5,] 1.435721 0.7466676 0.5200645
[6,] 1.462262 0.7305134 0.4995777
[7,] 1.525606 0.7651660 0.5015487
[8,] 1.467655 0.7977920 0.5435828
[9,] 1.468491 0.7736425 0.5268281
[10,] 1.441913 0.7346889 0.5095238
# for comparison
> datt$a/datt$b
[1] 1.365463 1.480327 1.504966 1.776483 1.435721 1.462262 1.525606
[8] 1.467655 1.468491 1.441913
As suggested by @akrun, it is not necessary to run combn twice:
do.call('data.frame', combn(L, 2, simplify = FALSE, FUN = function(x)
setNames(list(x[[1]]/x[[2]]), paste(names(x), collapse=':'))
))
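Another way to get the same table with a single combn() call is to enumerate pairs of column names and index the data frame by name. This is only a sketch; cols, pairs and quot are illustrative names.

```r
set.seed(1)
datt <- data.frame(a = rnorm(10, 30), b = rnorm(10, 20),
                   c = rnorm(10, 40), d = rnorm(10, 100))

cols  <- c("a", "b", "c")      # columns to include
pairs <- combn(cols, 2)        # 2 x 3 character matrix: a/b, a/c, b/c

# One quotient vector per pair, collected column-wise
quot <- as.data.frame(apply(pairs, 2, function(p) datt[[p[1]]] / datt[[p[2]]]))
names(quot) <- paste(pairs[1, ], pairs[2, ], sep = ".")
head(quot)
```

Working from column names rather than the vectors themselves means the selection can be changed by editing cols only.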

Retrieving row and column names in R

So I have a matrix TMatrix that I'm cycling through, and I want to put the row and column names of every cell that contains a non-finite value into a table. I've tried the following, but I keep getting NA for the row and column names. What's going on?
AA <- 1:rowlength
BB <- 1:ncol(Nmatrix)
for(i in AA){
for(j in BB){
if (is.finite(TMatrix[i,j])==FALSE){
TNS <- matrix(data=NA,nrow=1,ncol=4)
TNS[1,1] <- TMatrix[i,j]
TNS[1,2] <- Nmatrix[i,j]
TNS[1,3] <- paste(rownames(TMatrix)[TMatrix[i,j]])
TNS[1,4] <- paste(colnames(TMatrix)[TMatrix[i,j]])
TMinf <- rbind(TMinf,TNS)
}
PMatrix[i,j] <- pt(TMatrix[i,j],n1+n2-2)
}
}
No idea what this is doing because you provided none of the objects we would need to run it, but it sounds like you want something like the following example:
mat <- matrix(rnorm(20), nrow = 4)
mat[1, 4] <- mat[3, 2] <- NA
# [,1] [,2] [,3] [,4] [,5]
# [1,] 0.11025848 1.1021023 -0.3098129 NA -0.1358902
# [2,] 0.00351275 0.1440906 1.2141437 0.2601651 0.2504035
# [3,] -1.11565805 NA 0.1483867 -0.4102958 -0.3104319
# [4,] 0.34785864 1.5319365 1.2750632 0.1259548 -0.7594117
which(!is.finite(mat), arr.ind = TRUE)
# row col
# [1,] 3 2
# [2,] 1 4
If you have the rows/columns named:
colnames(mat) <- LETTERS[1:5]
rownames(mat) <- letters[1:4]
# A B C D E
# a 0.11025848 1.1021023 -0.3098129 NA -0.1358902
# b 0.00351275 0.1440906 1.2141437 0.2601651 0.2504035
# c -1.11565805 NA 0.1483867 -0.4102958 -0.3104319
# d 0.34785864 1.5319365 1.2750632 0.1259548 -0.7594117
idx <- which(!is.finite(mat), arr.ind = TRUE)
rownames(mat)[idx[ , 'row']]
# [1] "c" "a"
colnames(mat)[idx[ , 'col']]
# [1] "B" "D"
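To collect everything into one table, as the question asks, the index matrix from which() can be used to look up the names and the values in one go. A minimal sketch with a small made-up matrix (the name bad is illustrative):

```r
# Illustrative matrix with one Inf and one NA
mat <- matrix(c(1, Inf, 3, NA, 5, 6), nrow = 2,
              dimnames = list(c("r1", "r2"), c("A", "B", "C")))

idx <- which(!is.finite(mat), arr.ind = TRUE)   # row/col of non-finite cells

bad <- data.frame(row   = rownames(mat)[idx[, "row"]],
                  col   = colnames(mat)[idx[, "col"]],
                  value = mat[idx],             # matrix indexing: one value per row of idx
                  row.names = NULL,
                  stringsAsFactors = FALSE)
bad
```

This avoids growing a table with rbind() inside a double loop.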
Never mind, I figured it out. I had the index wrong. It should be like this:
AA <- 1:rowlength
BB <- 1:ncol(Nmatrix)
for(i in AA){
for(j in BB){
if (is.finite(TMatrix[i,j])==FALSE){
TNS <- matrix(data=NA,nrow=1,ncol=4)
TNS[1,1] <- TMatrix[i,j]
TNS[1,2] <- Nmatrix[i,j]
TNS[1,3] <- rownames(TMatrix)[i]
TNS[1,4] <- colnames(TMatrix)[j]
TMinf <- rbind(TMinf,TNS)
}
PMatrix[i,j] <- pt(TMatrix[i,j],n1+n2-2)
}
}

Resources