Non-redundant version of expand.grid - r

The R function expand.grid returns all possible combinations of the elements of the supplied vectors, e.g.
> expand.grid(c("aa", "ab", "cc"), c("aa", "ab", "cc"))
Var1 Var2
1 aa aa
2 ab aa
3 cc aa
4 aa ab
5 ab ab
6 cc ab
7 aa cc
8 ab cc
9 cc cc
Is there an efficient way to directly obtain only the 'unique' combinations of the supplied vectors (that is, without comparing rows after calling expand.grid)? The output would be:
Var1 Var2
1 aa aa
2 ab aa
3 cc aa
5 ab ab
6 cc ab
9 cc cc
EDIT: The combination of each element with itself could also be discarded from the answer. I don't actually need it in my program, even though (mathematically) aa aa is a regular unique combination of one element of Var1 and one of Var2.
The solution needs to produce pairs of elements from both vectors (i.e. one from each of the input vectors), so that it can be applied to more than 2 inputs.

How about using outer? Note, however, that with paste it concatenates each pair into a single character string.
outer( c("aa", "ab", "cc"), c("aa", "ab", "cc") , "paste" )
# [,1] [,2] [,3]
#[1,] "aa aa" "aa ab" "aa cc"
#[2,] "ab aa" "ab ab" "ab cc"
#[3,] "cc aa" "cc ab" "cc cc"
You can also use combn on the unique elements of the two vectors if you don't want the repeating elements (e.g. aa aa)
vals <- c( c("aa", "ab", "cc"), c("aa", "ab", "cc") )
vals <- unique( vals )
combn( vals , 2 )
# [,1] [,2] [,3]
#[1,] "aa" "aa" "ab"
#[2,] "ab" "cc" "cc"

In base R, you can use this:
expand.grid.unique <- function(x, y, include.equals=FALSE)
{
    x <- unique(x)
    y <- unique(y)
    g <- function(i)
    {
        z <- setdiff(y, x[seq_len(i-include.equals)])
        if(length(z)) cbind(x[i], z, deparse.level=0)
    }
    do.call(rbind, lapply(seq_along(x), g))
}
Results:
> x <- c("aa", "ab", "cc")
> y <- c("aa", "ab", "cc")
> expand.grid.unique(x, y)
[,1] [,2]
[1,] "aa" "ab"
[2,] "aa" "cc"
[3,] "ab" "cc"
> expand.grid.unique(x, y, include.equals=TRUE)
[,1] [,2]
[1,] "aa" "aa"
[2,] "aa" "ab"
[3,] "aa" "cc"
[4,] "ab" "ab"
[5,] "ab" "cc"
[6,] "cc" "cc"
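To see how the function above behaves when the two inputs are not identical, here is a small sketch that repeats its definition and applies it to 1:3 and 2:4 (example vectors chosen here for illustration, not taken from the answer). In this particular case each unordered pair appears once, but swapping the arguments returns fewer rows, so the result appears to depend on the argument order and is worth checking for your own data.

```r
# Same definition as in the answer above, repeated so the example is self-contained
expand.grid.unique <- function(x, y, include.equals = FALSE) {
  x <- unique(x)
  y <- unique(y)
  g <- function(i) {
    z <- setdiff(y, x[seq_len(i - include.equals)])
    if (length(z)) cbind(x[i], z, deparse.level = 0)
  }
  do.call(rbind, lapply(seq_along(x), g))
}

# Illustrative inputs (an assumption, not from the original answer)
res         <- expand.grid.unique(1:3, 2:4, include.equals = TRUE)
res_swapped <- expand.grid.unique(2:4, 1:3, include.equals = TRUE)

nrow(res)          # 8 rows
nrow(res_swapped)  # 6 rows: pairs such as (4, 2) are dropped
```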

If the two vectors are the same, there's the combinations function in the gtools package:
library(gtools)
combinations(n = 3, r = 2, v = c("aa", "ab", "cc"), repeats.allowed = TRUE)
# [,1] [,2]
# [1,] "aa" "aa"
# [2,] "aa" "ab"
# [3,] "aa" "cc"
# [4,] "ab" "ab"
# [5,] "ab" "cc"
# [6,] "cc" "cc"
And without "aa" "aa", etc.
combinations(n = 3, r = 2, v = c("aa", "ab", "cc"), repeats.allowed = FALSE)

The previous answers were lacking a way to get a specific result, namely to keep the self-pairs but remove the ones with different orders. The gtools package has two functions for these purposes, combinations and permutations. According to this website:
When the order doesn't matter, it is a Combination.
When the order does matter it is a Permutation.
In both cases, we have the decision to make of whether repetitions are allowed or not, and correspondingly, both functions have a repeats.allowed argument, yielding 4 combinations (deliciously meta!). It's worth going over each of these. I simplified the vector to single letters for ease of understanding.
Permutations with repetition
The most expansive option is to allow both self-relations and differently ordered options:
> permutations(n = 3, r = 2, repeats.allowed = TRUE, v = c("a", "b", "c"))
[,1] [,2]
[1,] "a" "a"
[2,] "a" "b"
[3,] "a" "c"
[4,] "b" "a"
[5,] "b" "b"
[6,] "b" "c"
[7,] "c" "a"
[8,] "c" "b"
[9,] "c" "c"
which gives us 9 options. This value can be found from the simple formula n^r i.e. 3^2=9. This is the Cartesian product/join for users familiar with SQL.
There are two ways to limit this: 1) remove self-relations (disallow repetitions), or 2) remove differently ordered options (i.e. combinations).
Combinations with repetitions
If we want to remove differently ordered options, we use:
> combinations(n = 3, r = 2, repeats.allowed = TRUE, v = c("a", "b", "c"))
[,1] [,2]
[1,] "a" "a"
[2,] "a" "b"
[3,] "a" "c"
[4,] "b" "b"
[5,] "b" "c"
[6,] "c" "c"
which gives us 6 options. The formula for this value is (r+n-1)!/(r!*(n-1)!) i.e. (2+3-1)!/(2!*(3-1)!)=4!/(2*2!)=24/4=6.
Permutations without repetition
If instead we want to disallow repetitions, we use:
> permutations(n = 3, r = 2, repeats.allowed = FALSE, v = c("a", "b", "c"))
[,1] [,2]
[1,] "a" "b"
[2,] "a" "c"
[3,] "b" "a"
[4,] "b" "c"
[5,] "c" "a"
[6,] "c" "b"
which also gives us 6 options, but different ones! The number of options is the same as above but it's a coincidence. The value can be found from the formula n!/(n-r)! i.e. (3*2*1)/(3-2)!=6/1!=6.
Combinations without repetitions
The most limiting case is when we want neither self-relations/repetitions nor differently ordered options, in which case we use:
> combinations(n = 3, r = 2, repeats.allowed = FALSE, v = c("a", "b", "c"))
[,1] [,2]
[1,] "a" "b"
[2,] "a" "c"
[3,] "b" "c"
which gives us only 3 options. The number of options can be calculated from the rather complex formula n!/(r!(n-r)!) i.e. 3*2*1/(2*1*(3-2)!)=6/(2*1!)=6/2=3.
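The four counts quoted above can be checked in base R without gtools. This is only a sketch of the arithmetic; the names n, r and counts are chosen here for illustration.

```r
n <- 3  # number of distinct elements ("a", "b", "c")
r <- 2  # number of elements per selection

counts <- c(
  perm_with_rep    = n^r,                               # n^r
  comb_with_rep    = choose(n + r - 1, r),              # (r+n-1)!/(r!*(n-1)!)
  perm_without_rep = factorial(n) / factorial(n - r),   # n!/(n-r)!
  comb_without_rep = choose(n, r)                       # n!/(r!(n-r)!)
)
counts  # 9, 6, 6, 3
```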

Try:
factors <- c("a", "b", "c")
all.combos <- t(combn(factors,2))
[,1] [,2]
[1,] "a" "b"
[2,] "a" "c"
[3,] "b" "c"
This will not include duplicates of each factor (e.g. "a" "a"), but you can add those on easily if needed.
dup.combos <- cbind(factors,factors)
factors factors
[1,] "a" "a"
[2,] "b" "b"
[3,] "c" "c"
all.combos <- rbind(all.combos,dup.combos)
factors factors
[1,] "a" "b"
[2,] "a" "c"
[3,] "b" "c"
[4,] "a" "a"
[5,] "b" "b"
[6,] "c" "c"

You can use a "greater than or equal to" comparison to filter out redundant combinations. This works with both numeric and character vectors, as long as both columns are built from the same vector.
> grid <- expand.grid(c("aa", "ab", "cc"), c("aa", "ab", "cc"), stringsAsFactors = FALSE)
> grid[grid$Var1 >= grid$Var2, ]
Var1 Var2
1 aa aa
2 ab aa
3 cc aa
5 ab ab
6 cc ab
9 cc cc
This shouldn't slow down your code too much. If you're expanding vectors containing larger elements (e.g. two lists of dataframes), I recommend using numeric indices that refer to the original vectors.
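A minimal sketch of the index-based variant mentioned above, assuming the things being paired live in a list. The list items and the built-in datasets used to fill it are placeholders for illustration only.

```r
# Hypothetical list of larger objects (built-in datasets used as stand-ins)
items <- list(a = head(mtcars), b = head(iris), c = head(cars))

# Expand positions instead of the objects themselves
idx <- expand.grid(i = seq_along(items), j = seq_along(items))
idx <- idx[idx$i >= idx$j, ]   # keep each unordered pair of positions once

# Look the objects up by position only when they are needed
pair_rows <- mapply(function(i, j) nrow(items[[i]]) + nrow(items[[j]]),
                    idx$i, idx$j)
nrow(idx)  # 6 index pairs, including self-pairs
```

Only the small integer grid is filtered; the data frames themselves are never copied into the grid.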

TL;DR
Use comboGrid from RcppAlgos:
library(RcppAlgos)
comboGrid(c("aa", "ab", "cc"), c("aa", "ab", "cc"))
Var1 Var2
[1,] "aa" "aa"
[2,] "aa" "ab"
[3,] "aa" "cc"
[4,] "ab" "ab"
[5,] "ab" "cc"
[6,] "cc" "cc"
The Details
I recently came across the question "R - Expand Grid Without Duplicates" and, while searching for duplicates, I found this one. The question here isn't exactly a duplicate, as it is a bit more general and has additional restrictions, which @Ferdinand.kraft shed some light on.
It should be noted that many of the solutions here make use of some sort of combination function. The expand.grid function returns the Cartesian product which is fundamentally different.
The Cartesian product operates on multiple objects which may or may not be the same. Generally speaking, combination functions are applied to a single vector. The same can be said about permutation functions.
Using combination/permutation functions will only produce comparable results to expand.grid if the vectors supplied are identical. As a very simple example, consider v1 = 1:3, v2 = 2:4.
With expand.grid, we see that rows 3 and 5 are duplicates:
expand.grid(1:3, 2:4)
Var1 Var2
1 1 2
2 2 2
3 3 2
4 1 3
5 2 3
6 3 3
7 1 4
8 2 4
9 3 4
Using combn doesn't quite get us to the solution:
t(combn(unique(c(1:3, 2:4)), 2))
[,1] [,2]
[1,] 1 2
[2,] 1 3
[3,] 1 4
[4,] 2 3
[5,] 2 4
[6,] 3 4
And with repeats using gtools, we generate too many:
gtools::combinations(4, 2, v = unique(c(1:3, 2:4)), repeats.allowed = TRUE)
[,1] [,2]
[1,] 1 1
[2,] 1 2
[3,] 1 3
[4,] 1 4
[5,] 2 2
[6,] 2 3
[7,] 2 4
[8,] 3 3
[9,] 3 4
[10,] 4 4
In fact, we generate results that are not even in the Cartesian product (i.e. the expand.grid solution).
We need a solution that creates the following:
Var1 Var2
[1,] 1 2
[2,] 1 3
[3,] 1 4
[4,] 2 2
[5,] 2 3
[6,] 2 4
[7,] 3 3
[8,] 3 4
I authored the package RcppAlgos, and in the latest release, v2.4.3, there is a function comboGrid which addresses this very problem. It is general, flexible, and fast.
First, to answer the specific question raised by the OP:
library(RcppAlgos)
comboGrid(c("aa", "ab", "cc"), c("aa", "ab", "cc"))
Var1 Var2
[1,] "aa" "aa"
[2,] "aa" "ab"
[3,] "aa" "cc"
[4,] "ab" "ab"
[5,] "ab" "cc"
[6,] "cc" "cc"
And as @Ferdinand.kraft points out, sometimes the output may need to have duplicates excluded within a given row. For that, we use repetition = FALSE:
comboGrid(c("aa", "ab", "cc"), c("aa", "ab", "cc"), repetition = FALSE)
Var1 Var2
[1,] "aa" "ab"
[2,] "aa" "cc"
[3,] "ab" "cc"
comboGrid is also very general. It can be applied to multiple vectors:
comboGrid(rep(list(c("aa", "ab", "cc")), 3))
Var1 Var2 Var3
[1,] "aa" "aa" "aa"
[2,] "aa" "aa" "ab"
[3,] "aa" "aa" "cc"
[4,] "aa" "ab" "ab"
[5,] "aa" "ab" "cc"
[6,] "aa" "cc" "cc"
[7,] "ab" "ab" "ab"
[8,] "ab" "ab" "cc"
[9,] "ab" "cc" "cc"
[10,] "cc" "cc" "cc"
It doesn't need the vectors to be identical:
comboGrid(1:3, 2:4)
Var1 Var2
[1,] 1 2
[2,] 1 3
[3,] 1 4
[4,] 2 2
[5,] 2 3
[6,] 2 4
[7,] 3 3
[8,] 3 4
And can be applied to vectors of various types:
set.seed(123)
my_range <- 3:15
mixed_types <- list(
int1 = sample(15, sample(my_range, 1)),
int2 = sample(15, sample(my_range, 1)),
char1 = sample(LETTERS, sample(my_range, 1)),
char2 = sample(LETTERS, sample(my_range, 1))
)
dim(expand.grid(mixed_types))
[1] 1950 4
dim(comboGrid(mixed_types, repetition = FALSE))
[1] 1595 4
dim(comboGrid(mixed_types, repetition = TRUE))
[1] 1770 4
The algorithm employed avoids generating the entirety of the Cartesian product and subsequently removing dupes. Ultimately, we create a hash table using the Fundamental theorem of arithmetic along with deduplication as pointed out by user2357112 supports Monica in the answer to Picking unordered combinations from pools with overlap. All of this together with the fact that it is written in C++ means that it is fast and memory efficient:
pools = list(c(1, 10, 14, 6),
c(7, 2, 4, 8, 3, 11, 12),
c(11, 3, 13, 4, 15, 8, 6, 5),
c(10, 1, 3, 2, 9, 5, 7),
c(1, 5, 10, 3, 8, 14),
c(15, 3, 7, 10, 4, 5, 8, 6),
c(14, 9, 11, 15),
c(7, 6, 13, 14, 10, 11, 9, 4),
c(6, 3, 2, 14, 7, 12, 9),
c(6, 11, 2, 5, 15, 7))
system.time(combCarts <- comboGrid(pools))
user system elapsed
0.929 0.062 0.992
nrow(combCarts)
[1] 1205740
## Small object created
print(object.size(combCarts), unit = "Mb")
92 Mb
system.time(cartProd <- expand.grid(pools))
user system elapsed
8.477 2.895 11.461
prod(lengths(pools))
[1] 101154816
## Very large object created
print(object.size(cartProd), unit = "Mb")
7717.5 Mb

Here's a very ugly version that worked for me on a similar problem.
AHP_code = letters[1:10]
temp. <- expand.grid(AHP_code, AHP_code, stringsAsFactors = FALSE)
temp. <- temp.[temp.$Var1 != temp.$Var2, ] # remove AA, BB, CC, etc.
temp.$combo <- NA
for(i in 1:nrow(temp.)){ # vectorizing this gave me weird results, loop worked fine.
temp.$combo[i] <- paste0(sort(as.character(temp.[i, 1:2])), collapse = "")
}
temp. <- temp.[!duplicated(temp.$combo),]
temp.
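For comparison, here is one possible vectorized form of the loop above. It is a sketch that assumes both columns are plain character vectors, and it uses pmin/pmax to put each pair in a fixed order instead of sorting row by row; the name pairs is chosen here for illustration.

```r
AHP_code <- letters[1:10]
pairs <- expand.grid(Var1 = AHP_code, Var2 = AHP_code, stringsAsFactors = FALSE)
pairs <- pairs[pairs$Var1 != pairs$Var2, ]   # remove aa, bb, cc, etc.

# Element-wise smaller and larger string of each row, pasted into one key
pairs$combo <- paste0(pmin(pairs$Var1, pairs$Var2),
                      pmax(pairs$Var1, pairs$Var2))
pairs <- pairs[!duplicated(pairs$combo), ]

nrow(pairs)  # 45, i.e. choose(10, 2)
```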

USING SORT
Just for fun, one can in principle also remove duplicates from expand.grid by combining sort and unique.
unique(t(apply(expand.grid(c("aa", "ab", "cc"), c("aa", "ab", "cc")), 1, sort)))
This gives:
[,1] [,2]
[1,] "aa" "aa"
[2,] "aa" "ab"
[3,] "aa" "cc"
[4,] "ab" "ab"
[5,] "ab" "cc"
[6,] "cc" "cc"

With repetitions (this won't work if you specify different vectors for different columns and for example values in the first column are always bigger than values in the second column):
> v=c("aa","ab","cc")
> e=expand.grid(v,v,stringsAsFactors=F)
> e[!apply(e,1,is.unsorted),]
Var1 Var2
1 aa aa
4 aa ab
5 ab ab
7 aa cc
8 ab cc
9 cc cc
Without repetitions (this requires using the same vector for each column):
> t(combn(c("aa","ab","cc"),2))
[,1] [,2]
[1,] "aa" "ab"
[2,] "aa" "cc"
[3,] "ab" "cc"
With repetitions and with different vectors for different columns:
> e=expand.grid(letters[25:26],letters[1:3],letters[2:3],stringsAsFactors=F)
> e[!duplicated(t(apply(e,1,sort))),]
Var1 Var2 Var3
1 y a b
2 z a b
3 y b b
4 z b b
5 y c b
6 z c b
7 y a c
8 z a c
11 y c c
12 z c c
Without repetitions and with different vectors for different columns:
> e=expand.grid(letters[25:26],letters[1:3],letters[2:3],stringsAsFactors=F)
> e=e[!duplicated(t(apply(e,1,sort))),]
> e[!apply(apply(e,1,duplicated),2,any),]
Var1 Var2 Var3
1 y a b
2 z a b
5 y c b
6 z c b
7 y a c
8 z a c
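The two filtering steps above can be wrapped in a small helper. This is a sketch, not a tested general solution: the name unordered_grid and its repetition argument are made up here, it assumes at least two input vectors with no NA values, and it still builds the full expand.grid result first, so it will not scale the way a dedicated implementation would.

```r
# Hypothetical helper: expand.grid followed by order-insensitive deduplication
unordered_grid <- function(..., repetition = TRUE) {
  g <- expand.grid(..., stringsAsFactors = FALSE)
  # Compare values as strings so that columns of the same type line up
  chr <- do.call(cbind, lapply(g, as.character))
  key <- t(apply(chr, 1, sort))          # each row sorted, one row per combination
  keep <- !duplicated(key)               # drop rows that differ only in order
  if (!repetition) {
    # additionally drop rows in which a value occurs more than once
    keep <- keep & apply(key, 1, anyDuplicated) == 0
  }
  g[keep, , drop = FALSE]
}

nrow(unordered_grid(1:3, 2:4))                      # 8
nrow(unordered_grid(1:3, 2:4, repetition = FALSE))  # 6
```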

Related

Split string with n repetitive elements into n sub-strings

I have a string that is a concatenation of m possible types of elements - for the sake of simplicity m = 4 with A, B, C and D.
Whenever there are single elements more than once, I would have to split the string so that there are no repetitions left. However, I would like to generate all possible strings without repetitions.
To make this a little bit clearer, here is an example:
For A B A C D
String: A B C D
String: B A C D
This gets more complicated when there are several different elements that show up more than once:
For A B A C B D
String: A B C D
String: A C B D
String: B A C D
String: A C B D
Is there a smart way to compute this in R?
vec <- c("A","B","A","C","B","D")
combs <- lapply(setNames(nm = unique(vec)), function(a) which(vec == a))
eg <- do.call(expand.grid, combs)
out <- t(apply(eg, 1, function(r) names(eg)[order(r)]))
out
# [,1] [,2] [,3] [,4]
# [1,] "A" "B" "C" "D"
# [2,] "B" "A" "C" "D"
# [3,] "A" "C" "B" "D"
# [4,] "A" "C" "B" "D"
First vector:
vec <- c("A","B","A","C","D")
# ...
# [,1] [,2] [,3] [,4]
# [1,] "A" "B" "C" "D"
# [2,] "B" "A" "C" "D"
If you are starting and ending with strings rather than vectors, you can wrap the above with:
strsplit("ABACBD", "")[[1]]
# [1] "A" "B" "A" "C" "B" "D"
apply(out, 1, paste, collapse = "")
# [1] "ABCD" "BACD" "ACBD" "ACBD"
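Putting the pieces above together, a string-in, strings-out wrapper might look like the sketch below. The function name split_unique is hypothetical, and the final unique() call is an addition that removes repeated results such as the second "ACBD".

```r
# Hypothetical wrapper around the steps shown above
split_unique <- function(s) {
  vec <- strsplit(s, "")[[1]]
  # positions at which each distinct element occurs
  pos <- lapply(setNames(nm = unique(vec)), function(a) which(vec == a))
  # one row per choice of a single position for every element
  eg <- do.call(expand.grid, pos)
  # order the element names by their chosen positions and paste them together
  out <- apply(eg, 1, function(r) paste(names(eg)[order(r)], collapse = ""))
  unique(unname(out))
}

split_unique("ABACBD")  # "ABCD" "BACD" "ACBD"
split_unique("ABACD")   # "ABCD" "BACD"
```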

Replace the contents of a vector with the values of a matrix

I hope I can explain this simply:
I have a matrix:
matrix(c("a","b","c",1,2,3), nrow=3, ncol=2)
with output:
[,1] [,2]
[1,] "a" "1"
[2,] "b" "2"
[3,] "c" "3"
I have a vector, for example:
vector1 <- c("b", "a", "b", "c")
I want another vector that picks, for each entry of this vector, the associated value from the matrix. That is, the final vector must be:
[1] 2 1 2 3
I can't figure it out at the moment.
Thank you
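Try match, where 'm1' is the matrix. It returns the row positions of the matches, which you can then use to pull the values from the second column (in this example the positions happen to equal the values):
as.numeric(m1[match(vector1, m1[,1]), 2])
#[1] 2 1 2 3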
Or
unname(setNames(as.numeric(m1[,2]), m1[,1])[vector1])
#[1] 2 1 2 3
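Because the example matrix holds the values 1, 2, 3 in row order, the positions returned by match and the looked-up values coincide. The sketch below uses a made-up matrix m2 with different values to show the difference between the two.

```r
# Hypothetical lookup matrix whose values differ from the row positions
m2 <- matrix(c("a", "b", "c", 10, 20, 30), nrow = 3, ncol = 2)
vector1 <- c("b", "a", "b", "c")

rows   <- match(vector1, m2[, 1])   # row positions: 2 1 2 3
values <- as.numeric(m2[rows, 2])   # looked-up values: 20 10 20 30
values
```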

Appending values with different order in R

I have two data elements in R:
data1
1 M
2 T
3 Z
4 A
5 J
data2 values
[1,] "A" "aa"
[2,] "J" "ab"
[3,] "M" "ac"
[4,] "T" "ad"
[5,] "Z" "ae"
I would like to get:
data1 values
[1,] "M" "ac"
[2,] "T" "ad"
[3,] "Z" "ae"
[4,] "A" "aa"
[5,] "J" "ab"
How can I append the values to data 1 such that they are sorted according to the different order in data 1?
You can get this behavior with the match function:
dat1 = data.frame(data1=c("M", "T", "Z", "A", "J"), stringsAsFactors=FALSE)
dat2 = data.frame(data2=c("A", "J", "M", "T", "Z"),
values=c("aa", "ab", "ac", "ad", "ae"), stringsAsFactors=FALSE)
dat2[match(dat1$data1, dat2$data2),]
# data2 values
# 3 M ac
# 4 T ad
# 5 Z ae
# 1 A aa
# 2 J ab
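If the goal is to attach the values to dat1 itself, the same match call can be used to index just the values column. A minimal sketch using the data from above:

```r
dat1 <- data.frame(data1 = c("M", "T", "Z", "A", "J"), stringsAsFactors = FALSE)
dat2 <- data.frame(data2 = c("A", "J", "M", "T", "Z"),
                   values = c("aa", "ab", "ac", "ad", "ae"),
                   stringsAsFactors = FALSE)

# For every key in dat1, find its row in dat2 and copy the value across
dat1$values <- dat2$values[match(dat1$data1, dat2$data2)]
dat1
```

Keys in dat1 that do not occur in dat2 would get NA.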

Creating data frame without duplicates in one column but could have duplicates in others

I have a problem creating a matrix when my data frame contains duplicates in both columns.
Example
n = c('A', 'B', 'C', 'A', 'B', 'B')
s = c("aa", "bb", "cc","dd","aa","cc")
df = data.frame(n, s)
But using df I need to create something like this:
new data frame (NDF)
A "aa" "dd"
B "bb" "aa" "cc"
C "cc"
As you can see, I used only the unique values from column n of my data frame df, and the rows are filled with values from df$s. The missing trailing values in this example could be zero or NA (right now they are empty).
F<-matrix(nrow=length(unique(df$n)),ncol=length(unique(df$s)))
But when I tried to write a loop here (for (i) ... for (j) ...), I could not figure out how to do it.
Any help is more than welcome
Thanks in advance
It is not clear what you want, since a data.frame has to be rectangular.
Maybe you want this:
tapply(s, n, list)
#$A
#[1] "aa" "dd"
#
#$B
#[1] "bb" "aa" "cc"
#
#$C
#[1] "cc"
You could use the dcast function from the reshape2 package to get the following data.frame:
library(reshape2)
dcast(data=df, n ~ s, value.var = "s")
n aa bb cc dd
1 A aa <NA> <NA> dd
2 B aa bb cc <NA>
3 C <NA> <NA> cc <NA>
If you want to have all non-NA values "in front" you need to do more. I've come to the following solution, which isn't pretty at all but works.
x <- dcast(data=df, n ~ s)
t(apply(x ,1 ,function(x){
tmp <- sum(is.na(x))
c(x[complete.cases(x)], rep(NA,tmp))
}))
[,1] [,2] [,3] [,4] [,5]
[1,] "A" "aa" "dd" NA NA
[2,] "B" "aa" "bb" "cc" NA
[3,] "C" "cc" NA NA NA
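A base R alternative that gets the "values in front, NA at the end" layout without reshaping is to split s by n and pad each group to the same length. This is only a sketch; the names groups, width and padded are chosen here, and it assumes at least one group has more than one value.

```r
n <- c("A", "B", "C", "A", "B", "B")
s <- c("aa", "bb", "cc", "dd", "aa", "cc")

groups <- split(s, n)                 # one character vector per value of n
width  <- max(lengths(groups))        # length of the longest group

# Pad every group with NA up to the common width, one row per group
padded <- t(sapply(groups, function(v) c(v, rep(NA, width - length(v)))))
padded
```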

Creating a full diallel using R

I am relatively new to R, so excuse me if this question is too basic.
I am wondering whether there is a good and fast way to create a full diallel using R.
I have a matrix that looks like this:
M1 M2 M3
Line1 A B A
Line2 A A B
Line3 B A A
From this matrix I would like to create the following data frame:
X Y M1 M2 M3
Line1 Line1 AA BB AA
Line1 Line2 AA BA AB
Line1 Line3 AB BA AA
Line2 Line1 AA AB BA
Line2 Line2 AA AA BB
Line2 Line3 AB AA BA
Line3 Line1 BA AB AA
Line3 Line2 BA AA AB
Line3 Line3 BB AA AA
I think this might be possible by creating a couple of nested loops and using paste to combine the A and B letter codes. But there are probably better and more "R-like" options (using cbind()?).
One approach is to think of the indices of the rows of your data that make up each line of the desired output. Using your data:
mat <- matrix(c("A","B","A",
"A","A","B",
"B","A","A"), ncol = 3, byrow = TRUE)
I create those indices using expand.grid(). The first row of your output is formed by the concatenation of row 1 of mat with row 1 of mat, and so on. These indices are produced as follows
> ind <- expand.grid(r1 = 1:3, r2 = 1:3)
> ind
r1 r2
1 1 1
2 2 1
3 3 1
4 1 2
5 2 2
6 3 2
7 1 3
8 2 3
9 3 3
Note that to get what your output shows we need to take columns r2 then r1 rather than the other way round.
Now I just index mat with the second column of ind and with the first column of ind, and supply both results to paste0(). Its output is a vector, so we need to reshape it into a matrix.
> matrix(paste0(mat[ind[,2], ], mat[ind[,1], ]), ncol = 3)
[,1] [,2] [,3]
[1,] "AA" "BB" "AA"
[2,] "AA" "BA" "AB"
[3,] "AB" "BA" "AA"
[4,] "AA" "AB" "BA"
[5,] "AA" "AA" "BB"
[6,] "AB" "AA" "BA"
[7,] "BA" "AB" "AA"
[8,] "BA" "AA" "AB"
[9,] "BB" "AA" "AA"
The paste0() step returns a vector of the pasted strings:
> paste0(mat[ind[,2], ], mat[ind[,1], ])
[1] "AA" "AA" "AB" "AA" "AA" "AB" "BA" "BA" "BB" "BB" "BA" "BA" "AB" "AA" "AA"
[16] "AB" "AA" "AA" "AA" "AB" "AA" "BA" "BB" "BA" "AA" "AB" "AA"
The trick as to why the matrix restructuring shown above works is to note that the entries in the output from paste0() are in column-major order because of how the index ind was formed. Essentially the two arguments passed to paste0() are:
> mat[ind[,2], ]
[,1] [,2] [,3]
[1,] "A" "B" "A"
[2,] "A" "B" "A"
[3,] "A" "B" "A"
[4,] "A" "A" "B"
[5,] "A" "A" "B"
[6,] "A" "A" "B"
[7,] "B" "A" "A"
[8,] "B" "A" "A"
[9,] "B" "A" "A"
> mat[ind[,1], ]
[,1] [,2] [,3]
[1,] "A" "B" "A"
[2,] "A" "A" "B"
[3,] "B" "A" "A"
[4,] "A" "B" "A"
[5,] "A" "A" "B"
[6,] "B" "A" "A"
[7,] "A" "B" "A"
[8,] "A" "A" "B"
[9,] "B" "A" "A"
R treats each as a vector and hence the output is a vector, but because R stores matrices by columns, we fill our output matrix with the pasted strings by columns also.
You might not need a couple of loops to get your output. Here is a suggestion:
To start with, let's generate your sample matrix:
M <- matrix(c("A","B","A","A","A","B","B","A","A"), ncol = 3, byrow = TRUE)
rownames(M) <- c("Line1","Line2","Line3")
colnames(M) <- c("M1","M2","M3")
An easy way to generate all possible pairs between items in a vector is to use expand.grid():
d <- expand.grid(rownames(M), rownames(M))
This generates the columns X and Y in your desired output:
Var1 Var2
1 Line1 Line1
2 Line2 Line1
3 Line3 Line1
4 Line1 Line2
5 Line2 Line2
6 Line3 Line2
7 Line1 Line3
8 Line2 Line3
9 Line3 Line3
Then, what you could do is to apply() a function to each row that pastes together the corresponding M1,M2,M3 values:
apply(d, 1, function(x) { paste(M[x[1],], paste(M[x[2],]), sep="")} )
It will generate the right combinations, but not with the right format (yet):
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] "AA" "AA" "BA" "AA" "AA" "BA" "AB" "AB" "BB"
[2,] "BB" "AB" "AB" "BA" "AA" "AA" "BA" "AA" "AA"
[3,] "AA" "BA" "AA" "AB" "BB" "AB" "AA" "BA" "AA"
To flip the matrix in the right direction, you simply have to transpose it.
From there, you can wrap everything into a data frame, in one go:
df <- data.frame( d, t(apply(d, 1, function(x) { paste(M[x[1],], paste(M[x[2],]), sep="")} )))
colnames(df) <- c("X","Y","M1","M2", "M3")
and here it is.
To make this reusable, you can finally write a little function that accepts any matrix M with row and column names.
get.it <- function(M){
    d <- expand.grid(rownames(M), rownames(M))
    e <- t(apply(d, 1, function(x) { paste(M[x[1],], M[x[2],], sep="")} ))
    output <- data.frame(d, e)
    colnames(output) <- c("X", "Y", colnames(M))
    return(output)
}
and get.it(M) should work!
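For reference, the index-based idea from the first answer can be combined with the row names used here to produce the X/Y layout from the question in one vectorized step. This is a sketch under the assumption that M has row and column names; the function name diallel is made up for illustration.

```r
M <- matrix(c("A", "B", "A", "A", "A", "B", "B", "A", "A"), ncol = 3, byrow = TRUE,
            dimnames = list(c("Line1", "Line2", "Line3"), c("M1", "M2", "M3")))

# Hypothetical helper: every ordered pair of rows, pasted column by column
diallel <- function(M) {
  # Y varies fastest so the rows come out as Line1/Line1, Line1/Line2, ...
  ind <- expand.grid(Y = seq_len(nrow(M)), X = seq_len(nrow(M)))
  crossed <- matrix(paste0(M[ind$X, ], M[ind$Y, ]), ncol = ncol(M),
                    dimnames = list(NULL, colnames(M)))
  data.frame(X = rownames(M)[ind$X], Y = rownames(M)[ind$Y], crossed,
             stringsAsFactors = FALSE)
}

res <- diallel(M)
res[2, ]  # X = Line1, Y = Line2, M1 = AA, M2 = BA, M3 = AB
```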
