Generate all combinations of length 2 using 3 letters - r

The simplest way to generate such combinations would be to use combn function as follows:
print(combn(letters[1:3],2))
The output that it generates is:
[,1] [,2] [,3]
[1,] "a" "a" "b"
[2,] "b" "c" "c"
This doesn't generate combinations like aa, bb (same character repeated).
It also doesn't generate ba if ab has already been generated.
I want to generate all such combinations of length 2 for a vector [a,b,c].
Is there a simple way to do so in R?

The paste0 function is vectorized and so succeeds with outer:
outer(c('a','b','c'), c('a','b','c'), paste0)
[,1] [,2] [,3]
[1,] "aa" "ab" "ac"
[2,] "ba" "bb" "bc"
[3,] "ca" "cb" "cc"
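If a flat vector is more convenient than a matrix, or if the length should be adjustable, expand.grid can build the same set. This is only a sketch: the names x, k, grid and combos are my own and not from the question.

```r
x <- c("a", "b", "c")
k <- 2  # length of each combination (assumed parameter)

# One column per position; every row is one ordered selection with repetition.
grid <- expand.grid(rep(list(x), k), stringsAsFactors = FALSE)

# Paste the columns together row by row.
combos <- do.call(paste0, grid)
print(combos)
```

For k = 2 this gives the same nine strings as the outer() call, just as a vector instead of a 3 x 3 matrix.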

Related

Combine character matrices into one

Given two matrices
a <- matrix(c("a","","","d"),2,2)
b <- matrix(c("","b","",""),2,2)
a
[,1] [,2]
[1,] "a" ""
[2,] "" "d"
b
[,1] [,2]
[1,] "" ""
[2,] "b" ""
Is there an easy way to combine these two into one and get
[,1] [,2]
[1,] "a" ""
[2,] "b" "d"
without looping over each individual element?
I am interested in the problem of this "merge" in general. However, for the time being, each cell is non-empty in only one of those matrices (i.e., the case where cell [1,1] contains something in matrix a and in matrix b is ruled out).
If the two matrices have the same dimensions, we can do:
ifelse(a == '', b, a)
# [,1] [,2]
#[1,] "a" ""
#[2,] "b" "d"
You can also do:
a[a == "" & b != ""] <- b[b != ""]
a
[,1] [,2]
[1,] "a" ""
[2,] "b" "d"
We can also use case_when:
library(dplyr)
case_when(a== '' ~ b, TRUE ~ a)
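Since the question rules out cells that are filled in both matrices, it may be worth checking that assumption before merging. A small sketch in base R (the names conflict and merged are made up for illustration):

```r
a <- matrix(c("a", "", "", "d"), 2, 2)
b <- matrix(c("", "b", "", ""), 2, 2)

# Cells that are non-empty in both matrices would be silently
# resolved in favour of 'a' by the ifelse() below, so stop early.
conflict <- a != "" & b != ""
stopifnot(!any(conflict))

# ifelse() keeps the dimensions of the test, so the result is a matrix.
merged <- ifelse(a == "", b, a)
print(merged)
```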

Replace values in one matrix with values from another

I am a programming newbie attempting to compare two matrices. If an element in the first column of mat1 matches any element in the first column of mat2, I want that element in mat1 to be replaced with the neighbour of the match in mat2 (same row, different column).
INPUT:
mat1<-matrix(letters[1:5])
mat2<-cbind(letters[4:8],1:5)
> mat1
[,1]
[1,] "a"
[2,] "b"
[3,] "c"
[4,] "d"
[5,] "e"
> mat2
[,1] [,2]
[1,] "d" "1"
[2,] "e" "2"
[3,] "f" "3"
[4,] "g" "4"
[5,] "h" "5"
Desired OUTPUT:
> mat3
[,1]
[1,] "a"
[2,] "b"
[3,] "c"
[4,] "1"
[5,] "2"
I have attempted the following without succeeding:
> for(x in mat1){mat3<-ifelse(x==mat2,mat2[which(x==mat2),2],mat1)}
> mat3
[,1] [,2]
[1,] "a" "a"
[2,] "2" "b"
[3,] "c" "c"
[4,] "d" "d"
[5,] "e" "e"
Any advice will be very appreciated. Have spent a whole day without making it work. It doesn't matter to me if the elements are in a matrix or a data frame.
Thanks.
ifelse is vectorized, so we can use it on the whole column. Build the test condition by checking whether the first-column values of 'mat1' are %in% the first column of 'mat2'. Where they are, get the index of the corresponding values with match and extract the values of the second column of 'mat2' with that index; otherwise return the first column of 'mat1'.
mat3 <- matrix(ifelse(mat1[,1] %in% mat2[,1],
mat2[,2][match(mat1[,1], mat2[,1])], mat1[,1]))
mat3
# [,1]
#[1,] "a"
#[2,] "b"
#[3,] "c"
#[4,] "1"
#[5,] "2"
Here is another base R solution
v <- `names<-`(mat2[,2],mat2[,1])
mat3 <- matrix(unname(ifelse(is.na(v[mat1]),mat1,v[mat1])))
which gives
> mat3
[,1]
[1,] "a"
[2,] "b"
[3,] "c"
[4,] "1"
[5,] "2"
An option using only logical indexing rather than a function:
mat3 <- mat1
mat3[mat1[,1] %in% mat2[,1], 1] <- mat2[mat2[,1] %in% mat1[,1], 2]
This subsets the values that occur in both matrices and replaces them where they do. Note that it relies on the matching elements appearing in the same order in mat1 and mat2.
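If the rows of mat2 are not guaranteed to be in the same order as mat1, a lookup with match avoids the ordering problem. A sketch with mat2 deliberately reversed (the reversed mat2 and the names idx and hit are my own, not from the question):

```r
mat1 <- matrix(letters[1:5])
mat2 <- cbind(letters[8:4], 5:1)  # same pairs as in the question, reversed order

# For each element of mat1, the row of mat2 holding it (NA if absent).
idx <- match(mat1[, 1], mat2[, 1])
hit <- !is.na(idx)

mat3 <- mat1
# Replace only the matched elements with their neighbours from mat2.
mat3[hit, 1] <- mat2[idx[hit], 2]
print(mat3)
```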

Make r ignore the order at which values appear in a column (created from pasting multiple columns)

Given a variable x that can take values A,B,C,D
And three columns for variable x:
df1<-
rbind(c("A","B","C"),c("A","D","C"),c("B","A","C"),c("A","C","B"), c("B","C","A"), c("D","A","B"), c("A","B","D"), c("A","D","C"), c("A",NA,NA),c("D","A",NA),c("A","D",NA))
How do I make a column indicating the combination in the three preceding columns, such that permutations (ABC, ACB, BAC) are considered the same combination ABC, and (AD, DA) are considered the same combination AD?
Pasting the three columns with apply(df1,1,function(x) paste(x[!is.na(x)], collapse=", "))->df1$x4 and using df1%>%group_by(x4)%>%summarize(c=n()) would count AD and DA as different instead of the same.
Edited title
My desired result would be to get
a<-cbind(c("ABC",4),c("ACD",2),c("ABD",2),c("A",1),c("AD",2))
Someone already solved my question. Thanks
You can apply function paste after sorting each row vector.
df1 <-
cbind(df1, apply(df1, 1, function(x) paste(sort(x), collapse = "")))
df1
# [,1] [,2] [,3] [,4]
# [1,] "A" "B" "C" "ABC"
# [2,] "A" "D" "C" "ACD"
# [3,] "B" "A" "C" "ABC"
# [4,] "A" "C" "B" "ABC"
# [5,] "B" "C" "A" "ABC"
# [6,] "D" "A" "B" "ABD"
# [7,] "A" "B" "D" "ABD"
# [8,] "A" "D" "C" "ACD"
# [9,] "A" NA NA "A"
#[10,] "D" "A" NA "AD"
#[11,] "A" "D" NA "AD"
You can now simply table the column, with no need for an external package to be loaded and more complex pipes.
table(df1[, 4])
#A ABC ABD ACD AD
#1 4 2 2 2
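If the counts are needed as a data frame (closer to the layout asked for in the question), the table can be converted. A sketch that leaves df1 untouched; the names key and counts are just illustrative:

```r
df1 <- rbind(c("A","B","C"), c("A","D","C"), c("B","A","C"), c("A","C","B"),
             c("B","C","A"), c("D","A","B"), c("A","B","D"), c("A","D","C"),
             c("A",NA,NA), c("D","A",NA), c("A","D",NA))

# sort() drops NA by default, so each row collapses to an order-free key.
key <- apply(df1, 1, function(x) paste(sort(x), collapse = ""))

# One row per combination, with its frequency in column 'Freq'.
counts <- as.data.frame(table(key), stringsAsFactors = FALSE)
print(counts)
```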

R: duplicates elimination in a matrix, keeping track of multiplicities

I have a basic problem with R.
I have produced the matrix
M
[,1] [,2]
[1,] "a" "1"
[2,] "b" "2"
[3,] "a" "3"
[4,] "c" "1"
I would like to obtain the 3x3 matrix
[,1] [,2] [,3]
[1,] "a" "1" "3"
[2,] "b" "2" NA
[3,] "c" "1" NA
obtained by eliminating duplicates in M[,1] and writing in N[i,2], N[i,3] the values in M[,2] corresponding to the same element in M[,1], for all i's. The "NA"'s in N[,3] correspond to the singletons in M[,1].
I know how to eliminate duplicates from a vector in R: my problem is to keep track of the elements in M[,2] and write them in the resulting matrix N. I tried with for loops, but they do not work so well in my "real world" case, where the matrices are much bigger.
Any suggestions?
I thank you very much.
You can use dcast in the reshape2 package after turning your matrix into a data.frame. To reverse the process you can use melt.
df = data.frame(c("a","b","a","c"),c(1:3,1))
colnames(df) = c("factor","obs")
require(reshape2)
df2=dcast(df, factor ~ obs)
now df2 is:
factor 1 2 3
1 a 1 NA 3
2 b NA 2 NA
3 c 1 NA NA
To me it makes more sense to keep it like this. But if you need it in your format:
res = t(apply(df2,1,function(x) { newLine = as.vector(x[which(!is.na(x))],mode="any"); newLine=c(newLine,rep(NA, ncol(df2)-length(newLine) )) }))
res = res[,-ncol(res)]
[,1] [,2] [,3]
[1,] "a" " 1" " 3"
[2,] "b" " 2" NA
[3,] "c" " 1" NA
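For comparison, here is a base R sketch that gets the requested layout directly from the character matrix, without reshape2. It is a different technique from the dcast answer above, and the names groups, width and N are my own:

```r
M <- matrix(c("a", "b", "a", "c",
              "1", "2", "3", "1"), ncol = 2)

# Collect the second-column values belonging to each first-column key.
groups <- split(M[, 2], M[, 1])

# Pad every group with NA up to the size of the largest group.
width <- max(lengths(groups))
padded <- lapply(groups, `length<-`, width)

# One row per unique key, followed by its values.
N <- unname(cbind(names(groups), do.call(rbind, padded)))
print(N)
```

Because the values stay character throughout, there is no padding with spaces as in the output above.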

Creating a full diallel using R

I am relatively new to R, so excuse me if this question is too basic.
I am wondering whether there is a good and fast way to create a full diallel using R?
I have a matrix that looks like:
M1 M2 M3
Line1 A B A
Line2 A A B
Line3 B A A
From this matrix I would like to create the following data frame:
X Y M1 M2 M3
Line1 Line1 AA BB AA
Line1 Line2 AA BA AB
Line1 Line3 AB BA AA
Line2 Line1 AA AB BA
Line2 Line2 AA AA BB
Line2 Line3 AB AA BA
Line3 Line1 BA AB AA
Line3 Line2 BA AA AB
Line3 Line3 BB AA AA
I think this might be possible by creating a couple of nested loops and using paste to combine the A and B lettercodes. But probably there are better and more "R-like" options (using cbind()?).
One approach is to think of the indices of the rows of your data that make up each line of the desired output. Using your data:
mat <- matrix(c("A","B","A",
"A","A","B",
"B","A","A"), ncol = 3, byrow = TRUE)
I create those indices using expand.grid(). The first row of your output is formed by the concatenation of row 1 of mat with row 1 of mat, and so on. These indices are produced as follows
> ind <- expand.grid(r1 = 1:3, r2 = 1:3)
> ind
r1 r2
1 1 1
2 2 1
3 3 1
4 1 2
5 2 2
6 3 2
7 1 3
8 2 3
9 3 3
Note that to get what your output shows we need to take columns r2 then r1 rather than the other way round.
Now I just index mat with the second column of ind and with the first column of ind, and supply both to paste0(). The output of paste0() is a vector, so we need to reshape it to a matrix.
> matrix(paste0(mat[ind[,2], ], mat[ind[,1], ]), ncol = 3)
[,1] [,2] [,3]
[1,] "AA" "BB" "AA"
[2,] "AA" "BA" "AB"
[3,] "AB" "BA" "AA"
[4,] "AA" "AB" "BA"
[5,] "AA" "AA" "BB"
[6,] "AB" "AA" "BA"
[7,] "BA" "AB" "AA"
[8,] "BA" "AA" "AB"
[9,] "BB" "AA" "AA"
The paste0() step returns a vector of the pasted strings:
> paste0(mat[ind[,2], ], mat[ind[,1], ])
[1] "AA" "AA" "AB" "AA" "AA" "AB" "BA" "BA" "BB" "BB" "BA" "BA" "AB" "AA" "AA"
[16] "AB" "AA" "AA" "AA" "AB" "AA" "BA" "BB" "BA" "AA" "AB" "AA"
The trick as to why the matrix restructuring shown above works is to note that the entries in the output from paste0() are in column-major order because of how the index ind was formed. Essentially the two arguments passed to paste0() are:
> mat[ind[,2], ]
[,1] [,2] [,3]
[1,] "A" "B" "A"
[2,] "A" "B" "A"
[3,] "A" "B" "A"
[4,] "A" "A" "B"
[5,] "A" "A" "B"
[6,] "A" "A" "B"
[7,] "B" "A" "A"
[8,] "B" "A" "A"
[9,] "B" "A" "A"
> mat[ind[,1], ]
[,1] [,2] [,3]
[1,] "A" "B" "A"
[2,] "A" "A" "B"
[3,] "B" "A" "A"
[4,] "A" "B" "A"
[5,] "A" "A" "B"
[6,] "B" "A" "A"
[7,] "A" "B" "A"
[8,] "A" "A" "B"
[9,] "B" "A" "A"
R treats each as a vector and hence the output is a vector, but because R stores matrices by columns, we fill our output matrix with the pasted strings by columns also.
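To get from that matrix to the data frame in the question, the same index can also be used to look up the line names. A sketch, assuming mat carries the row and column names from the question (the dimnames and the names crossed and diallel are added here for illustration):

```r
mat <- matrix(c("A","B","A",
                "A","A","B",
                "B","A","A"), ncol = 3, byrow = TRUE,
              dimnames = list(paste0("Line", 1:3), paste0("M", 1:3)))

n <- nrow(mat)
ind <- expand.grid(r1 = seq_len(n), r2 = seq_len(n))

# Same paste0() step as above, keeping the marker names as column names.
crossed <- matrix(paste0(mat[ind$r2, ], mat[ind$r1, ]), ncol = ncol(mat),
                  dimnames = list(NULL, colnames(mat)))

# r2 supplies X and r1 supplies Y, matching the order of the pasted letters.
diallel <- data.frame(X = rownames(mat)[ind$r2],
                      Y = rownames(mat)[ind$r1],
                      crossed, stringsAsFactors = FALSE)
print(diallel)
```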
You might not need a couple of loops to get your output. Here is a suggestion:
To start with, let's generate your sample matrix:
M <- matrix(c("A","B","A","A","A","B","B","A","A"), ncol = 3, byrow = TRUE)
rownames(M) <- c("Line1","Line2","Line3")
colnames(M) <- c("M1","M2","M3")
An easy way to generate all possible pairs between items in a vector is to use expand.grid():
d <- expand.grid(rownames(M), rownames(M))
This generates the columns X and Y of your desired output:
Var1 Var2
1 Line1 Line1
2 Line2 Line1
3 Line3 Line1
4 Line1 Line2
5 Line2 Line2
6 Line3 Line2
7 Line1 Line3
8 Line2 Line3
9 Line3 Line3
Then, what you could do is to apply() a function to each row that pastes together the corresponding M1,M2,M3 values:
apply(d, 1, function(x) { paste(M[x[1],], paste(M[x[2],]), sep="")} )
It will generate the right combinations, but not with the right format (yet):
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] "AA" "AA" "BA" "AA" "AA" "BA" "AB" "AB" "BB"
[2,] "BB" "AB" "AB" "BA" "AA" "AA" "BA" "AA" "AA"
[3,] "AA" "BA" "AA" "AB" "BB" "AB" "AA" "BA" "AA"
To flip the matrix in the right direction, you simply have to transpose it.
From there, you can wrap everything into a data frame, in one go:
df <- data.frame( d, t(apply(d, 1, function(x) { paste(M[x[1],], paste(M[x[2],]), sep="")} )))
colnames(df) <- c("X","Y","M1","M2", "M3")
and here it is.
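To be more efficient, you can finally write a little function to which you submit any M matrix that has row and column names.
get.it <- function(M){
  # all ordered pairs of row names (X, Y)
  d <- expand.grid(rownames(M), rownames(M))
  # paste the marker values of the two lines, one row per pair
  e <- t(apply(d, 1, function(x) paste(M[x[1],], M[x[2],], sep="")))
  output <- data.frame(d, e)
  # take the marker names from M instead of hard-coding them
  colnames(output) <- c("X","Y",colnames(M))
  return(output)
}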
and get.it(M) should work!

Resources