Making symmetrical matrix of characters [duplicate] - r

I have a matrix of characters:
mat1
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] "0" "B" "A" "C" "D" "D"
[2,] "0" "0" "B" "C" "C" "C"
[3,] "0" "0" "0" "D" "D" "C"
[4,] "0" "0" "0" "0" "B" "B"
[5,] "0" "0" "0" "0" "0" "A"
[6,] "0" "0" "0" "0" "0" "0"
I want to make a symmetric matrix from it, as below:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] "0" "B" "A" "C" "D" "D"
[2,] "B" "0" "B" "C" "C" "C"
[3,] "A" "B" "0" "D" "D" "C"
[4,] "C" "C" "D" "0" "B" "B"
[5,] "D" "C" "D" "B" "0" "A"
[6,] "D" "C" "C" "B" "A" "0"

You can fill the lower triangle of the matrix with the lower triangle of its transpose, using the lower.tri function on mat1 to select those positions:
mat1[lower.tri(mat1)] <- t(mat1)[lower.tri(mat1)]
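To check the idea end to end, here is a minimal, self-contained sketch. It rebuilds the example matrix from the question (the construction via upper.tri and the name sym are only for illustration), applies the same assignment, and confirms that the result equals its own transpose.

```r
# Rebuild the upper-triangular example matrix from the question.
# upper.tri() fills positions column by column.
mat1 <- matrix("0", 6, 6)
mat1[upper.tri(mat1)] <- c("B",
                           "A", "B",
                           "C", "C", "D",
                           "D", "C", "D", "B",
                           "D", "C", "C", "B", "A")

# Copy the upper triangle into the lower triangle:
# element [i, j] of t(sym) is element [j, i] of sym.
sym <- mat1
sym[lower.tri(sym)] <- t(sym)[lower.tri(sym)]

# A symmetric matrix is identical to its transpose.
identical(sym, t(sym))
```

The diagonal is left untouched because lower.tri excludes it by default.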

Related

In a list object ordering characters in each row

I have a list object, which I got by splitting strings with strsplit(data, ";"). The elements are characters such as "A", "B" and so on, and each element has a different length, so I wrote a for loop to build a matrix from it. I want column 1 to contain the same value, "A", in every row.
Here is the code I wrote, but it does not work as I wanted.
myList <- list()
myList[[1]] <- c("A", "C", 0, 0)
myList[[2]] <- c("A", "B", "C")
myList[[3]] <- c("A", 0, 0, 0, 0)
myList[[4]] <- c("B", "A")
myList[[5]] <- c("Aa", "A", "B", 0, 0)
myList[[6]] <- c("Aa", "A", "C", 0, 0)
myList[[7]] <- c("C", "A", 0, 0)
myList
TD = TD2 = matrix(0, length(myList), 5)
for (i in 1:length(myList)) {
  m1 = length(myList[[i]])
  TD[i, 1:m1] = matrix(myList[[i]], ncol = m1, byrow = TRUE)
}
for (j in 1:length(myList)) {
  TD2[j, ] = TD[j, order(TD[j, ], decreasing = TRUE)]
}
Desired output to be
[,1] [,2] [,3] [,4] [,5]
[1,] "A" "C" "0" "0" "0"
[2,] "A" "C" "B" "0" "0"
[3,] "A" "0" "0" "0" "0"
[4,] "A" "B" "0" "0" "0"
[5,] "A" "B" "Aa" "0" "0"
[6,] "A" "C" "Aa" "0" "0"
[7,] "A" "C" "0" "0" "0"
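You can define a factor with a custom level order using factor(..., ordered = TRUE) and sort it. Here the levels are the non-zero values ordered by decreasing frequency across the whole list, with "0" placed last.
# Frequency table of all values; [-1] drops its first entry, "0"
ord <- names(sort(table(unlist(myList))[-1], decreasing = TRUE))
len <- max(lengths(myList))
t(sapply(myList, function(x) {
  # Sort by level order, pad to the common length, then turn the padding into "0"
  y <- sort(factor(x, levels = c(ord, "0"), ordered = TRUE))[1:len]
  replace(y, is.na(y), "0")
}))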
# [,1] [,2] [,3] [,4] [,5]
# [1,] "A" "C" "0" "0" "0"
# [2,] "A" "C" "B" "0" "0"
# [3,] "A" "0" "0" "0" "0"
# [4,] "A" "B" "0" "0" "0"
# [5,] "A" "B" "Aa" "0" "0"
# [6,] "A" "C" "Aa" "0" "0"
# [7,] "A" "C" "0" "0" "0"
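As a small illustration of why sorting an ordered factor gives the custom order, the sketch below sorts a single row. The level order is written out by hand here (an assumption made for the example) rather than computed from the frequency table.

```r
# Level order assumed for illustration: most frequent value first, "0" last.
lev <- c("A", "C", "B", "Aa", "0")

# One row of the input, in its original order.
x <- c("Aa", "A", "B", "0", "0")

# sort() on an ordered factor follows the level order, not the alphabet.
f <- factor(x, levels = lev, ordered = TRUE)
sorted <- as.character(sort(f))
sorted
# "A" "B" "Aa" "0" "0"
```

A plain sort(x) would instead use the locale's alphabetical order, which is why the factor levels are needed to push "Aa" behind "B".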
This is not exactly what you want, since it sorts each row alphabetically after dropping the zeros, but it also puts "A" in the first column:
TD = matrix(0, length(myList), 5)
for (i in 1:length(myList)) {
  # Drop the "0" entries, then sort what is left alphabetically
  myList[[i]] = sort(myList[[i]][myList[[i]] != "0"])
  for (j in 1:length(myList[[i]])) TD[i, j] = myList[[i]][j]
}
> TD
[,1] [,2] [,3] [,4] [,5]
[1,] "A" "C" "0" "0" "0"
[2,] "A" "B" "C" "0" "0"
[3,] "A" "0" "0" "0" "0"
[4,] "A" "B" "0" "0" "0"
[5,] "A" "Aa" "B" "0" "0"
[6,] "A" "Aa" "C" "0" "0"
[7,] "A" "C" "0" "0" "0"

Unique Sets Permutations R [duplicate]

Is there a way to generate all the unique sets from the following permutations, in a way that lets me change N and R easily?
library(gtools)
x <- c("A","B","C","D")
x <- permutations(n=4,r=2,v=x)
x
[,1] [,2]
[1,] "A" "B"
[2,] "A" "C"
[3,] "A" "D"
[4,] "B" "A"
[5,] "B" "C"
[6,] "B" "D"
[7,] "C" "A"
[8,] "C" "B"
[9,] "C" "D"
[10,] "D" "A"
[11,] "D" "B"
[12,] "D" "C"
For example, sets 1 and 4 are not unique: AB and BA contain the same characters.
The following list is unique, and this is what I want.
[,1] [,2]
[1,] "A" "B"
[2,] "A" "C"
[3,] "A" "D"
[4,] "B" "C"
[5,] "B" "D"
[6,] "C" "D"
combn would give you what you need:
# combn gives the combinations; t() only transposes the result so that each set is a row
t(combn(x, 2))
# [,1] [,2]
#[1,] "A" "B"
#[2,] "A" "C"
#[3,] "A" "D"
#[4,] "B" "C"
#[5,] "B" "D"
#[6,] "C" "D"
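Since the question asks for something where N and R are easy to change, one possible way to wrap this is sketched below. The helper name unique_sets is hypothetical and not part of any package; it only forwards to combn.

```r
# Hypothetical helper: all unordered sets of size r drawn from vector v,
# returned with one set per row.
unique_sets <- function(v, r) {
  t(combn(v, r))
}

x <- c("A", "B", "C", "D")
pairs <- unique_sets(x, 2)    # 6 rows, 2 columns
triples <- unique_sets(x, 3)  # 4 rows, 3 columns

# The number of rows always equals choose(length(v), r).
nrow(pairs) == choose(length(x), 2)
```

Changing N means passing a longer or shorter vector, and changing R means changing the second argument.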

Replacing values in a dataframe by row in R

Is there a way in R to replace values in each row of a matrix/dataframe with a specific value from that row?
For example, I have the following matrix:
df<-cbind(c("A","C","G","T"),c("T","G","C","A"),c(0,1,0,1),c(1,0,1,0),c(0,1,0,1))
df
# [,1] [,2] [,3] [,4] [,5]
#[1,] "A" "T" "0" "1" "0"
#[2,] "C" "G" "1" "0" "1"
#[3,] "G" "C" "0" "1" "0"
#[4,] "T" "A" "1" "0" "1"
and I want to replace the zeros in each row with the corresponding letter from the first column of that row, such that the new matrix will look like this:
newdf
# [,1] [,2] [,3] [,4] [,5]
#[1,] "A" "T" "A" "1" "A"
#[2,] "C" "G" "1" "C" "1"
#[3,] "G" "C" "G" "1" "G"
#[4,] "T" "A" "1" "T" "1"
The closest I have been able to get is with the following commands, but it does not replace the zeros with the correct values from column 1.
df[df==0]<-NA
df[, 3:ncol(df)][is.na(df[, 3:ncol(df)])] <- df[,1]
We can replicate the first column so that it has as many elements as the matrix, then do the assignment based on a logical matrix. Indexing both sides with the same logical matrix, i1, makes the replacement values on the right-hand side line up with the elements being replaced.
i1 <- df == 0
newdf <- df
newdf[i1] <- df[,1][row(df)][i1]
newdf
# [,1] [,2] [,3] [,4] [,5]
#[1,] "A" "T" "A" "1" "A"
#[2,] "C" "G" "1" "C" "1"
#[3,] "G" "C" "G" "1" "G"
#[4,] "T" "A" "1" "T" "1"
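An alternative way to express the same replacement, offered only as a sketch, is to look up the row number of every zero with which(..., arr.ind = TRUE) and take the first-column value of that row. The names idx and newdf2 are introduced here for illustration.

```r
df <- cbind(c("A", "C", "G", "T"), c("T", "G", "C", "A"),
            c(0, 1, 0, 1), c(1, 0, 1, 0), c(0, 1, 0, 1))

# Row and column positions of every "0" in the matrix.
idx <- which(df == "0", arr.ind = TRUE)

# Replace each zero with the first-column value of its own row.
newdf2 <- df
newdf2[idx] <- df[idx[, "row"], 1]
newdf2
```

Indexing a matrix with a two-column matrix of positions addresses one cell per row of idx, so each replacement value is paired with exactly one zero.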

Row number in dataframe based on multiple parameters in R

I wish to find the row number, based on multiple parameters. I have made this test matrix:
data=
[,1] [,2] [,3]
[1,] "1" "a" "0"
[2,] "2" "b" "0"
[3,] "3" "c" "0"
[4,] "4" "a" "0"
[5,] "1" "b" "0"
[6,] "2" "c" "0"
[7,] "3" "a" "0"
[8,] "4" "b" "0"
Then I want to get the row number where
data[,1]==1 and data[,2]=='b'
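One way to get this, sketched under the assumption that the test matrix is a character matrix exactly as printed above, is to combine both conditions with & and pass the result to which(). The construction of data below is only there to make the example self-contained.

```r
# Rebuild the test matrix from the question (all values are character).
data <- cbind(rep(c("1", "2", "3", "4"), 2),
              rep(c("a", "b", "c"), length.out = 8),
              "0")

# Both conditions must hold in the same row; which() returns the row numbers.
hit <- which(data[, 1] == "1" & data[, 2] == "b")
hit
# 5
```

If several rows match, which() returns all of their row numbers, and if none match it returns an empty integer vector.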

Exclude rows where element has been previously met for N times

I have following input data:
# [,1] [,2]
#[1,] "A" "B"
#[2,] "A" "C"
#[3,] "A" "D"
#[4,] "B" "C"
#[5,] "B" "D"
#[6,] "C" "D"
Next, I want to exclude rows in which the first or second element has already appeared N times in previous rows. For example, if N = 2, the following rows need to be excluded:
#[3,] "A" "D" - element "A" has already appeared 2 times
#[5,] "B" "D" - element "B" has already appeared 2 times
#[6,] "C" "D" - element "C" has already appeared 2 times
Note: excluded rows must be taken into account immediately. For example, if an element appears 5 times in the input but only once after earlier rows have been removed, then the next row containing that element must be kept, because that is only its second appearance.
Example (N=2):
Input data:
[,1] [,2]
[1,] "A" "B"
[2,] "A" "C"
[3,] "A" "D"
[4,] "A" "E"
[5,] "B" "C"
[6,] "B" "D"
[7,] "B" "E"
[8,] "C" "D"
[9,] "C" "E"
[10,] "D" "E"
Output data:
[,1] [,2]
[1,] "A" "B"
[2,] "A" "C"
[5,] "B" "C"
[10,] "D" "E"
There are possibly more elegant solutions... but this seems to work:
v <- c("A", "B", "C", "D", "E")
cmb <- t(combn(v, 2))
n <- 2
# Go through each letter
for (l in v) {
  # Find the combinations using that letter
  rows <- apply(cmb, 1, function(x) l %in% x)
  rows.2 <- which(rows)
  if (length(rows.2) > n)
    rows.2 <- rows.2[1:n]
  # Take the first n rows containing the letter,
  # then append all the ones not containing it
  cmb <- rbind(cmb[rows.2, ], cmb[!rows, ])
}
cmb
which outputs the same rows as the desired result, in a different order:
[,1] [,2]
[1,] "D" "E"
[2,] "B" "C"
[3,] "A" "C"
[4,] "A" "B"
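If the original row order matters, a single pass over the rows with a running count per element is another option. This is only a sketch: the function name keep_first_n and its arguments are hypothetical, and it assumes a row is kept only when every element in it has appeared fewer than n times in the rows kept so far.

```r
# Hypothetical helper: keep a row only if none of its elements has
# already appeared n times among the rows kept so far.
keep_first_n <- function(m, n) {
  elems <- unique(as.vector(m))
  counts <- setNames(integer(length(elems)), elems)  # running counts, all 0
  keep <- logical(nrow(m))
  for (i in seq_len(nrow(m))) {
    row_elems <- unique(m[i, ])
    if (all(counts[row_elems] < n)) {
      keep[i] <- TRUE
      # Only kept rows are counted, so exclusions take effect immediately.
      counts[row_elems] <- counts[row_elems] + 1L
    }
  }
  m[keep, , drop = FALSE]
}

cmb <- t(combn(c("A", "B", "C", "D", "E"), 2))
res <- keep_first_n(cmb, 2)
res
```

For the ten-row example with N = 2 this keeps the rows A B, A C, B C and D E, in their original order.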
