I have a very basic question about R.
I have a table like this:
A B C D E
7 1 6 8 7
9 3 9 5 9
4 6 2 1 10
10 5 3 4 1
1 3 5 9 3
6 4 8 7 6
I want to find the correlation of each variable with every other variable in the table. The final report should look something like this:
Var_1 Var_2 Correlation
A A 1
A B -0.022991544
A C 0.231553
A D -0.28037
A E -0.05230
B A -0.02299
B B 1
…
…
E D -0.39223
E E 1
Below is the R code I am using to achieve this:
rm(list=ls())
test <- read.csv("D:/AB/test.csv")
iterations <- ncol(test)
correlation <- matrix(ncol = 3 , nrow = iterations)
for (k in 1:iterations) {
for (l in 1:iterations){
corr <- cor(test[,k], test[,l])
corr_string_A <- names(test[k])
corr_string_B <- names(test[l])
correlation[l,] <- rbind(corr_string_A, corr_string_B, corr)
}
}
But I end up with only the output for the E variable:
> correlation
[,1] [,2] [,3]
[1,] "E" "A" "-0.0523026032815805"
[2,] "E" "B" "0"
[3,] "E" "C" "0.231900361745681"
[4,] "E" "D" "-0.392232270276368"
[5,] "E" "E" "1"
I understand that the nested for loops in the above code have an indexing issue, and hence only the "E" series is stored. I am not able to figure it out.
If anyone can help me, it would be really great.
EDIT*
Changing the input data a bit
A B C D E
0 0 6 8 7
0 0 9 5 9
0 0 2 1 10
0 0 3 4 1
0 0 5 9 3
0 0 8 7 6
If a column contains only zeros, its correlation with any other column is undefined (R's cor() returns NA with a warning that the standard deviation is zero). I want to handle these missing values and replace them with some other value according to the business specification. Sorry for the late addition. Thank you for your understanding.
To answer your question without altering your code too much, there are two main issues. First, you are not allocating a matrix of the correct size. There are five iterations over five variables, or 25 combinations (with some combinations appearing twice, i.e. A/C = C/A) in this example, so you need to fix your matrix declaration to account for that:
correlation <- matrix(ncol = 3 , nrow = iterations * iterations)
Second, you are only assigning values to the first five rows of this matrix within your nested for loop. This line:
correlation[l,] <- rbind(corr_string_A, corr_string_B, corr)
needs a row index greater than l (which can only reach 5 in the example) after the first pass through the inner loop, like this:
correlation[l + ((k-1) * iterations),] <- rbind(corr_string_A, corr_string_B, corr)
This code should fix those problems:
iterations <- ncol(test)
correlation <- matrix(ncol = 3 , nrow = iterations * iterations)
for (k in 1:iterations) {
for (l in 1:iterations){
corr <- cor(test[,k], test[,l])
corr_string_A <- names(test[k])
corr_string_B <- names(test[l])
correlation[l + ((k-1) * iterations),] <- rbind(corr_string_A, corr_string_B, corr)
}
}
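The row index l + ((k-1) * iterations) can be checked on its own. The following is a minimal sketch, assuming five columns as in the example; the name row_index is made up for illustration and is not part of the original code.

```r
iterations <- 5
# Row index used in the loop for every (k, l) pair; k is the outer
# loop variable, l the inner one. Rows of row_index follow l, columns follow k.
row_index <- outer(1:iterations, 1:iterations,
                   function(l, k) l + (k - 1) * iterations)
# Column k of row_index holds the rows written while the outer loop is at k.
row_index[, 1]   # rows filled during k = 1
row_index[, 5]   # rows filled during k = 5
# Every row from 1 to 25 is used exactly once.
all(as.vector(row_index) == 1:25)
```

With the original index correlation[l,], every pass of the outer loop would write to rows 1 to 5 again, which is why only the last variable survived.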
The Hmisc package has an rcorr function that will return a list whose first item is the correlation matrix. It requires a matrix as input, which the function data.matrix is designed to deliver. The transformation to a three column format is accomplished by the as.data.frame.table function:
library(Hmisc)
as.data.frame.table( rcorr(data.matrix(dat))[[1]] )
#-------
Var1 Var2 Freq
1 A A 1.00000000
2 B A -0.02299154
3 C A 0.23155349
4 D A -0.28036851
5 E A -0.05230260
6 A B -0.02299154
7 B B 1.00000000
8 C B -0.58384037
9 D B -0.80175394
10 E B 0.00000000
11 A C 0.23155349
12 B C -0.58384037
13 C C 1.00000000
14 D C 0.52094591
15 E C 0.23190036
16 A D -0.28036851
17 B D -0.80175394
18 C D 0.52094591
19 D D 1.00000000
20 E D -0.39223227
21 A E -0.05230260
22 B E 0.00000000
23 C E 0.23190036
24 D E -0.39223227
25 E E 1.00000000
The names<- function can be used to dress up column names to your specification.
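As a rough illustration of the renaming step, here is a sketch that rebuilds the example data by hand and uses base R's cor() in place of rcorr(), so that it runs without extra packages. That substitution and the object name long are assumptions made for this sketch.

```r
# Example data copied from the question.
dat <- data.frame(A = c(7, 9, 4, 10, 1, 6),
                  B = c(1, 3, 6, 5, 3, 4),
                  C = c(6, 9, 2, 3, 5, 8),
                  D = c(8, 5, 1, 4, 9, 7),
                  E = c(7, 9, 10, 1, 3, 6))
# Correlation matrix converted to the three-column (long) format.
long <- as.data.frame(as.table(cor(dat)))
# Rename the three columns with the replacement function `names<-`.
names(long) <- c("Var_1", "Var_2", "Correlation")
head(long, 3)
```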
I have two data frames and I am trying to create a function which takes a data frame and a row name as arguments and returns the three highest values in that row (in descending order) together with the names of their columns.
set.seed(0)
df <- data.frame(A=c(3,2,1,4,5),B=c(1,6,3,8,4),C=c(2,1,4,8,9), D=c(4,1,2,4,6))
row.names(df)<-c("R1","R2","R3","R4","R5")
df2 <- data.frame(E=c(2,5,6,1,4),F=c(2,4,2,5,1),G=c(5,6,2,7,3),H=c(8,2,7,4,1))
row.names(df2)<-c("R6","R7","R8","R9","R10")
print(df)
A B C D
R1 3 1 2 4
R2 2 6 1 1
R3 1 3 4 2
R4 4 8 8 4
R5 5 4 9 6
print(df2)
E F G H
R6 2 2 5 8
R7 5 4 6 2
R8 6 2 2 7
R9 1 5 7 4
R10 4 1 3 1
Here is an example of a result:
Let the function be maxthree. Now
maxthree(df2, "R7")
G E F
6 5 4
Here is what I have done so far:
maxthree <- function(data,row) {
if(!row %in% rownames(data)) {
print("Check value")
} else {
max_col <- which.max(data[row,])
print(max_col)
}
}
This function returns the position of the maximum value in that row together with its column name. However, I don't know how to add the second and third highest values to the function.
maxthree = function(data, row) {
data[row, order(unlist(data[row, ]), decreasing = TRUE)[1:3]]
}
maxthree(df2, "R7")
# G E F
# R7 6 5 4
The result is a 1x3 data frame.
This should work great
maxthree <- function(data,roww){
x <- data[roww,]
x[order(unlist(x), decreasing = TRUE)][1:3]
}
> maxthree(df2, "R7")
G E F
R7 6 5 4
Try this:
df <- data.frame(A=c(3,2,1,4,5),B=c(1,6,3,8,4),C=c(2,1,4,8,9), D=c(4,1,2,4,6))
row.names(df)<-c("R1","R2","R3","R4","R5")
df2 <- data.frame(E=c(2,5,6,1,4),F=c(2,4,2,5,1),G=c(5,6,2,7,3),H=c(8,2,7,4,1))
row.names(df2)<-c("R6","R7","R8","R9","R10")
maxthree <- function(data,row) {
named_vec <- t(data)[,row]
return(sort(named_vec, decreasing = TRUE)[1:3])
}
maxthree(df2, "R7")
# G E F
# 6 5 4
This approach transposes your data frame with t(), which makes it straightforward to extract the row as a named vector. sort can then be used to order the values as desired.
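One caveat with transposing is that t() returns a matrix, and a matrix can hold only one type. The sketch below uses a made-up data frame (mixed, with a hypothetical character column id) to show what that means when not all columns are numeric.

```r
# Hypothetical data frame mixing a character column with numeric ones.
mixed <- data.frame(id = c("x", "y"), E = c(2, 5), F = c(2, 4),
                    row.names = c("R6", "R7"))
# t() returns a matrix, and a matrix holds one type only,
# so the numbers are converted to character strings.
class(t(mixed)[, "R7"])
# Selecting the numeric columns first keeps the values numeric.
class(t(mixed[c("E", "F")])[, "R7"])
```

For an all-numeric data frame such as df2, this issue does not arise.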
You can use sort and [1:3] to get the first 3 elements like:
maxthree <- function(data,row) {sort(data[row,], TRUE)[1:3]}
maxthree(df2, "R7")
# G E F
#R7 6 5 4
In case the rowname should not be shown you can add unlist:
maxthree <- function(data,row) {head(unlist(sort(data[row,], TRUE)),3)}
maxthree(df2, "R7")
#G E F
#6 5 4
You can use the order function.
maxthree <- function(data, row_name) data[row_name, order(-data[row_name,])][, 1:3]
maxthree(df2, 'R7')
G E F
R7 6 5 4
In R, I systematically try to avoid "for" loops and use the lapply() family instead.
But how can I do so when an iteration contains an increment step?
For example: is it possible to obtain the same result as below with an lapply approach?
a <- c()
b <- c()
set.seed(1L) # required for reproducible data
for (i in 1:10){
a <- c(a, sample(c(0,1), 1))
b <- c(b, (paste(a, collapse = "-")))
}
data.frame(a, b)
a b
1 0 0
2 1 0-1
3 0 0-1-0
4 0 0-1-0-0
5 1 0-1-0-0-1
6 0 0-1-0-0-1-0
7 0 0-1-0-0-1-0-0
8 0 0-1-0-0-1-0-0-0
9 1 0-1-0-0-1-0-0-0-1
10 1 0-1-0-0-1-0-0-0-1-1
EDIT
My question was very badly worded. The new example below is much more illustrative: is there any way to use the lapply family if each iteration is calculated from the previous one?
a <- c()
b <- c()
for (i in 1:10){
a <- c(a, sample(c(0,1), 1))
b <- c(b, (paste(a, collapse = "-")))
}
data.frame(a, b)
> data.frame(a, b)
a b
1 0 0
2 1 0-1
3 0 0-1-0
4 1 0-1-0-1
5 1 0-1-0-1-1
6 1 0-1-0-1-1-1
7 1 0-1-0-1-1-1-1
8 0 0-1-0-1-1-1-1-0
9 1 0-1-0-1-1-1-1-0-1
10 1 0-1-0-1-1-1-1-0-1-1
For the sake of completeness, there is also the accumulate() function from the purrr package.
So, building on the answers of Sotos and ThomasIsCoding:
df <- data.frame(a = 1:10)
df$b <- purrr::accumulate(df$a, paste, sep = "-")
df
a b
1 1 1
2 2 1-2
3 3 1-2-3
4 4 1-2-3-4
5 5 1-2-3-4-5
6 6 1-2-3-4-5-6
7 7 1-2-3-4-5-6-7
8 8 1-2-3-4-5-6-7-8
9 9 1-2-3-4-5-6-7-8-9
10 10 1-2-3-4-5-6-7-8-9-10
The differences from Reduce() are that accumulate() is a function verb on its own (no additional parameter accumulate = TRUE required), and that additional arguments like sep = "-" can be passed on to the mapped function, which may help to avoid the creation of an anonymous function.
EDIT
If I understand correctly OP's edit of the question, the OP is asking if a for loop which computes a result iteratively can be replaced by lapply().
This is difficult to answer for me. Here are some thoughts and observations:
First, accumulate() still will work:
set.seed(1L) # required for reproducible data
df <- data.frame(a = sample(0:1, 10L, TRUE))
df$b <- purrr::accumulate(df$a, paste, sep = "-")
df
a b
1 0 0
2 1 0-1
3 0 0-1-0
4 0 0-1-0-0
5 1 0-1-0-0-1
6 0 0-1-0-0-1-0
7 0 0-1-0-0-1-0-0
8 0 0-1-0-0-1-0-0-0
9 1 0-1-0-0-1-0-0-0-1
10 1 0-1-0-0-1-0-0-0-1-1
This is possible because the computation of a can be pulled out of the loop as it does not depend on b.
IMHO, accumulate() and Reduce() do what the OP is looking for but are not called lapply(): they take the result of the previous iteration and combine it with the current value, for instance
Reduce(`+`, 1:3)
returns the sum of 1, 2, and 3 by iteratively computing ((1 + 2) + 3); without an init argument, the first element serves as the starting value. This can be visualised by using the accumulate parameter
Reduce(`+`, 1:3, accumulate = TRUE)
[1] 1 3 6
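To make the role of the starting value concrete, here is a small sketch in base R. The init value 10 and the object names are arbitrary choices for illustration.

```r
# Without init, the first element is the starting value: ((1 + 2) + 3).
sums <- Reduce(`+`, 1:3, accumulate = TRUE)
sums
# With an explicit init of 10, the starting value becomes the first
# element of the accumulated result, which is one element longer.
sums_init <- Reduce(`+`, 1:3, 10, accumulate = TRUE)
sums_init
# Any two-argument function works, e.g. pasting with a separator.
pasted <- Reduce(function(prev, cur) paste(prev, cur, sep = "-"),
                 1:3, accumulate = TRUE)
pasted
```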
Second, there is a major difference between a for loop and functions of the lapply() family: lapply(X, FUN, ...) requires a function FUN to be called on each element of X. So, scoping rules for functions apply.
When we transplant the body of the loop into an anonymous function within lapply()
a <- c()
b <- c()
set.seed(1L) # required for reproducible data
lapply(1:10, function(i) {
a <- c(a, sample(c(0,1), 1))
b <- c(b, (paste(a, collapse = "-")))
})
we get
[[1]]
[1] "0"
[[2]]
[1] "1"
[[3]]
[1] "0"
[[4]]
[1] "0"
[[5]]
[1] "1"
[[6]]
[1] "0"
[[7]]
[1] "0"
[[8]]
[1] "0"
[[9]]
[1] "1"
[[10]]
[1] "1"
data.frame(a, b)
data frame with 0 columns and 0 rows
Due to the scoping rules, the assignments to a and b inside the function create local variables. The a and b defined outside of the function are read but never modified.
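The same effect can be seen in a smaller sketch. The variable counter is made up for this illustration.

```r
counter <- 0
res <- lapply(1:3, function(i) {
  # This creates a new local `counter`; the outer one is only read.
  counter <- counter + i
  counter
})
unlist(res)  # each call started again from the outer value 0
counter      # the outer variable is unchanged
```

Each call sees the outer value 0, adds i to a local copy, and discards that copy when the function returns.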
This can be fixed by global assignment using the global assignment operator <<-:
a <- c()
b <- c()
set.seed(1L) # required for reproducible data
lapply(1:10, function(i) {
a <<- c(a, sample(c(0,1), 1))
b <<- c(b, (paste(a, collapse = "-")))
})
data.frame(a, b)
a b
1 0 0
2 1 0-1
3 0 0-1-0
4 0 0-1-0-0
5 1 0-1-0-0-1
6 0 0-1-0-0-1-0
7 0 0-1-0-0-1-0-0
8 0 0-1-0-0-1-0-0-0
9 1 0-1-0-0-1-0-0-0-1
10 1 0-1-0-0-1-0-0-0-1-1
However, global assignment is considered bad programming practice and should be avoided, see, e.g., the 6th Circle of Patrick Burns' The R Inferno and many questions on SO.
Third, the loop as written grows the vectors on every iteration. This is also considered bad practice because the data has to be copied over and over again, which may slow things down tremendously as the size increases. See, e.g., the 2nd Circle of Patrick Burns' The R Inferno.
However, the original code
a <- c()
b <- c()
set.seed(1L) # required for reproducible data
for (i in 1:10) {
a <- c(a, sample(c(0,1), 1))
b <- c(b, (paste(a, collapse = "-")))
}
data.frame(a, b)
can be re-written as
a <- integer(10)
b <- character(10)
set.seed(1L) # required for reproducible data
for (i in seq_along(a)) {
a[i] <- sample(c(0,1), 1)
b[i] <- if (i == 1L) a[1] else paste(b[i-1], a[i], sep = "-")
}
data.frame(a, b)
Here, vectors are pre-allocated with the required size to hold the result. Elements to update are identified by subscripting.
The calculation of b[i] still depends only on the value from the previous iteration, b[i-1], and the current value a[i], as requested by the OP.
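As a quick check, the two versions can be run side by side with the same seed. This is only a sketch; the names a1, b1, a2, b2 and n are introduced here for the comparison.

```r
n <- 10
# Version 1: grow the vectors inside the loop, as in the question.
set.seed(1L)
a1 <- c(); b1 <- c()
for (i in 1:n) {
  a1 <- c(a1, sample(c(0, 1), 1))
  b1 <- c(b1, paste(a1, collapse = "-"))
}
# Version 2: pre-allocate and fill by index.
set.seed(1L)
a2 <- numeric(n); b2 <- character(n)
for (i in seq_len(n)) {
  a2[i] <- sample(c(0, 1), 1)
  b2[i] <- if (i == 1L) as.character(a2[1]) else paste(b2[i - 1], a2[i], sep = "-")
}
# Both versions draw the same random numbers and build the same strings.
identical(a1, a2) && identical(b1, b2)
```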
Another way is to use Reduce with accumulate = TRUE, i.e.
df$new <- do.call(rbind, Reduce(paste, split(df, seq(nrow(df))), accumulate = TRUE))
which gives,
a new
1 1 1
2 2 1 2
3 3 1 2 3
4 4 1 2 3 4
5 5 1 2 3 4 5
6 6 1 2 3 4 5 6
7 7 1 2 3 4 5 6 7
8 8 1 2 3 4 5 6 7 8
9 9 1 2 3 4 5 6 7 8 9
10 10 1 2 3 4 5 6 7 8 9 10
You can use sapply (lapply would work too, but it returns a list): iterate over every value of a in df, create the sequence up to that value, and paste its elements together.
df <- data.frame(a = 1:10)
df$b <- sapply(df$a, function(x) paste(seq(x), collapse = "-"))
df
# a b
#1 1 1
#2 2 1-2
#3 3 1-2-3
#4 4 1-2-3-4
#5 5 1-2-3-4-5
#6 6 1-2-3-4-5-6
#7 7 1-2-3-4-5-6-7
#8 8 1-2-3-4-5-6-7-8
#9 9 1-2-3-4-5-6-7-8-9
#10 10 1-2-3-4-5-6-7-8-9-10
If the data could contain non-numeric values, for which seq cannot be used, such as
df <- data.frame(a =letters[1:10])
then we can use
df$b <- sapply(seq_along(df$a), function(x) paste(df$a[seq_len(x)], collapse = "-"))
df
# a b
#1 a a
#2 b a-b
#3 c a-b-c
#4 d a-b-c-d
#5 e a-b-c-d-e
#6 f a-b-c-d-e-f
#7 g a-b-c-d-e-f-g
#8 h a-b-c-d-e-f-g-h
#9 i a-b-c-d-e-f-g-h-i
#10 j a-b-c-d-e-f-g-h-i-j
Another way of using Reduce, different from the approach by @Sotos
df$b <- Reduce(function(...) paste(..., sep = "-"), df$a, accumulate = TRUE)
such that
> df
a b
1 1 1
2 2 1-2
3 3 1-2-3
4 4 1-2-3-4
5 5 1-2-3-4-5
6 6 1-2-3-4-5-6
7 7 1-2-3-4-5-6-7
8 8 1-2-3-4-5-6-7-8
9 9 1-2-3-4-5-6-7-8-9
10 10 1-2-3-4-5-6-7-8-9-10
I have a table like this:
A B C D E
7 1 6 8 7
9 3 9 5 9
4 6 2 1 10
10 5 3 4 1
1 3 5 9 3
6 4 8 7 6
I am in the process of finding the correlation of each variable with every other variable in the table. This is the R code I use:
test <- read.csv("D:/AB/test.csv")
iterations <- ncol(test)
correlation <- matrix(ncol = 3 , nrow = iterations * iterations)
for (k in 1:iterations) {
for (l in 1:iterations){
corr <- cor(test[,k], test[,l])
corr_string_A <- names(test[k])
corr_string_B <- names(test[l])
correlation[l + ((k-1) * iterations),] <- rbind(corr_string_A, corr_string_B, corr)
}
}
The following is the output that I received:
Var1 Var2 value
1 A A 1.00000000
2 B A 0.50018605
3 C A -0.35747393
4 D A -0.25670054
5 E A -0.02974821
6 A B 0.50018605
7 B B 1.00000000
8 C B 0.56070716
9 D B 0.46164928
10 E B 0.16813991
11 A C -0.35747393
12 B C 0.56070716
13 C C 1.00000000
14 D C 0.52094589
15 E C 0.23190036
16 A D -0.25670054
17 B D 0.46164928
18 C D 0.52094589
19 D D 1.00000000
20 E D -0.39223227
21 A E -0.02974821
22 B E 0.16813991
23 C E 0.23190036
24 D E -0.39223227
25 E E 1.00000000
However, I don't want the values from the upper triangle; i.e., no diagonal values should occur, and each unique combination should appear only once. The final output should look like:
Var1 Var2 value
1 B A 0.50018605
2 C A -0.35747393
3 D A -0.25670054
4 E A -0.02974821
5 C B 0.56070716
6 D B 0.46164928
7 E B 0.16813991
8 D C 0.52094589
9 E C 0.23190036
10 E D -0.39223227
I understand that there are techniques, such as reshape, with which the above output can be achieved, but I want to adapt the above R code to produce the results mentioned above.
I believe the range of the second for loop should change dynamically, which would help achieve this. However, I am not sure how to make this work.
You can convert your correlation matrix to the 3-column format with as.table and as.data.frame, and then use subset to keep only the values above or below the diagonal.
subset(as.data.frame(as.table(cor(dat))),
match(Var1, names(dat)) > match(Var2, names(dat)))
# Var1 Var2 Freq
# 2 B A -0.02299154
# 3 C A 0.23155350
# 4 D A -0.28036851
# 5 E A -0.05230260
# 8 C B -0.58384036
# 9 D B -0.80175393
# 10 E B 0.00000000
# 14 D C 0.52094589
# 15 E C 0.23190036
# 20 E D -0.39223227
Note that for larger datasets this should be much more efficient than separately calling cor on pairs of variables because cor is vectorized, and further it's clearly a lot less typing.
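Another way to pick out one triangle of the correlation matrix is lower.tri(). The sketch below rebuilds the example data by hand; the names cm, keep and pair_df are made up for illustration.

```r
# Example data copied from the question.
dat <- data.frame(A = c(7, 9, 4, 10, 1, 6),
                  B = c(1, 3, 6, 5, 3, 4),
                  C = c(6, 9, 2, 3, 5, 8),
                  D = c(8, 5, 1, 4, 9, 7),
                  E = c(7, 9, 10, 1, 3, 6))
cm <- cor(dat)
# Logical mask of the cells strictly below the diagonal.
keep <- lower.tri(cm)
# row() and col() give the position of every cell, so the mask
# selects the matching row name, column name and value together.
pair_df <- data.frame(Var1 = rownames(cm)[row(cm)[keep]],
                      Var2 = colnames(cm)[col(cm)[keep]],
                      value = cm[keep])
pair_df
```

This yields one row per unordered pair and no diagonal entries, in the same order as the subset call above.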
If you really must keep the looping code, then you can achieve your desired result with small changes to the pair of for loops and some bookkeeping about the row of correlation that you are computing:
iterations <- ncol(test)
correlation <- matrix(ncol = 3 , nrow = choose(iterations, 2))
pos <- 1
for (k in 2:iterations) {
for (l in 1:(k-1)){
corr <- cor(test[,k], test[,l])
corr_string_A <- names(test[k])
corr_string_B <- names(test[l])
correlation[pos,] <- rbind(corr_string_A, corr_string_B, corr)
pos <- pos+1
}
}
However I really wouldn't suggest this looping solution; it would be better to use the one-liner I provided and then to handle all generated NA values afterward.
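Handling the NA values afterward could look roughly like this. It is a sketch using the modified data with constant columns; the replacement value 0 is only a placeholder, since the actual business rule is not given.

```r
# Modified data from the question: columns A and B are constant.
dat <- data.frame(A = rep(0, 6), B = rep(0, 6),
                  C = c(6, 9, 2, 3, 5, 8),
                  D = c(8, 5, 1, 4, 9, 7),
                  E = c(7, 9, 10, 1, 3, 6))
# cor() warns that the standard deviation is zero and returns NA
# when a constant column is paired with another column.
long <- suppressWarnings(as.data.frame(as.table(cor(dat))))
# Placeholder replacement value; the real one depends on the business rule.
na_value <- 0
long$Freq[is.na(long$Freq)] <- na_value
head(long)
```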
From the OP's loop output, we can subset the rows,
df1[!duplicated(t(apply(df1[1:2], 1, sort))) & df1[,1]!=df1[,2],]
# Var1 Var2 value
#2 B A 0.50018605
#3 C A -0.35747393
#4 D A -0.25670054
#5 E A -0.02974821
#8 C B 0.56070716
#9 D B 0.46164928
#10 E B 0.16813991
#14 D C 0.52094589
#15 E C 0.23190036
#20 E D -0.39223227
Or as I mentioned (first) in the comments, just use
cor(test)
Let me try to make this question as general as possible.
Let's say I have two variables a and b.
a <- as.integer(runif(20, min = 0, max = 10))
a <- as.data.frame(a)
b <- as.data.frame(a[c(-7, -11, -15),])
So b has 17 observations and is a subset of a which has 20 observations.
My question is the following: how would I use these two variables to generate a third variable c which, like a, has 20 observations, but for which observations 7, 11 and 15 are missing and the other observations are identical to b, in the order of a?
Or to put it somewhat differently: how could I squeeze these missing observations into variable b at locations 7, 11 and 15?
It seems pretty straightforward (and it probably is), but I have been failing to get this to work for a bit too long now.
1) Loop. Try this loop:
# test data
set.seed(123) # for reproducibility
a <- as.integer(runif(20, min = 0, max = 10))
a <- as.data.frame(a)
b <- as.data.frame(a[c(-7, -11, -15),])
# let's work with vectors
A <- a[[1]]
B <- b[[1]]
j <- 1
C <- A
for(i in seq_along(A)) if (A[i] == B[j]) j <- j+1 else C[i] <- NA
which gives:
> C
[1] 2 7 4 8 9 0 NA 8 5 4 NA 4 6 5 NA 8 2 0 3 9
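The loop compares A[i] with B[j] even after every element of B has been matched, in which case B[j] is NA and the if condition fails with an error. The following is a sketch of a guarded variant; greedy_align is a hypothetical helper name, and the short input vectors are made up for illustration.

```r
# Hypothetical helper wrapping the loop above; it stops comparing
# once every element of B has been matched.
greedy_align <- function(A, B) {
  C <- A
  j <- 1
  for (i in seq_along(A)) {
    if (j <= length(B) && A[i] == B[j]) {
      j <- j + 1      # A[i] is the next element of B: keep it
    } else {
      C[i] <- NA      # A[i] has no counterpart in B: mark as missing
    }
  }
  C
}
# A trailing element that is absent from B no longer causes an error.
greedy_align(c(3, 1, 4, 1, 5), c(3, 4, 1))
```

For the example data the guard changes nothing, because the last element of A matches the last element of B.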
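2) Reduce. Here is a loop-free version:
f <- function(j, a) j + (a == B[j])
# start the index into B at 1; without init, Reduce would use A[1] as the start
r <- Reduce(f, A, 1, accumulate = TRUE)
# r has one more element than A (the initial 1), so drop the first flag
ifelse(duplicated(r)[-1], NA, A)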
giving:
[1] 2 7 4 8 9 0 NA 8 5 4 NA 4 6 5 NA 8 2 0 3 9
3) dtw. Using dtw in the package of the same name we can get a compact loop-free one-liner:
library(dtw)
ifelse(duplicated(dtw(A, B)$index2), NA, A)
giving:
[1] 2 7 4 8 9 0 NA 8 5 4 NA 4 6 5 NA 8 2 0 3 9
REVISED Added additional solutions.
Here's a more complicated way of doing it, using the Levenshtein distance algorithm, that does a better job on more complicated examples (it also seemed faster in a couple of larger tests I tried):
# using same data as G. Grothendieck:
set.seed(123) # for reproducibility
a <- as.integer(runif(20, min = 0, max = 10))
a <- as.data.frame(a)
b <- as.data.frame(a[c(-7, -11, -15),])
A = a[[1]]
B = b[[1]]
# compute the transformation between the two, assigning infinite weight to
# insertion and substitution
# using +1 here because the integers fed to intToUtf8 have to be larger than 0
# could also adjust the range more dynamically based on A and B
transf = attr(adist(intToUtf8(A+1), intToUtf8(B+1),
costs = c(Inf,1,Inf), counts = TRUE), 'trafos')
C = A
C[substring(transf, 1:nchar(transf), 1:nchar(transf)) == "D"] <- NA
#[1] 2 7 4 8 9 0 NA 8 5 4 NA 4 6 5 NA 8 2 0 3 9
More complex matching example (where the greedy algorithm would perform poorly):
A = c(1,1,2,2,1,1,1,2,2,2)
B = c(1,1,1,2,2,2)
transf = attr(adist(intToUtf8(A), intToUtf8(B),
costs = c(Inf,1,Inf), counts = TRUE), 'trafos')
C = A
C[substring(transf, 1:nchar(transf), 1:nchar(transf)) == "D"] <- NA
#[1] NA NA NA NA 1 1 1 2 2 2
# the greedy algorithm would return this instead:
#[1] 1 1 NA NA 1 NA NA 2 2 2
The data frame version, which isn't terribly different from G.'s above.
(Assumes a,b setup as above).
j <- 1
c <- a
for (i in (seq_along(a[,1]))) {
if (a[i,1]==b[j,1]) {
j <- j+1
} else
{
c[i,1] <- NA
}
}
I have a matrix in R. Each entry i,j is a score and the rownames and colnames are ids.
Instead of the matrix I just want a 3 column matrix that has: i,j,score
Right now I'm using nested for loops. Like:
for(i in rownames(g))
{
print(which(rownames(g)==i))
for(j in colnames(g))
{
cur.vector<-c(cur.ref, i, j, g[rownames(g) %in% i,colnames(g) %in% j])
rbind(new.file,cur.vector)->new.file
}
}
But I think that's very inefficient. I'm sure there's a better way; I'm just not good enough with R yet.
Thoughts?
If I understand you correctly, you need to flatten the matrix.
You can use as.vector and rep to add the id columns, e.g.:
m = cbind(c(1,2,3),c(4,5,6),c(7,8,9))
row.names(m) = c('R1','R2','R3')
colnames(m) = c('C1','C2','C3')
d <- data.frame(i=rep(row.names(m),ncol(m)),
j=rep(colnames(m),each=nrow(m)),
score=as.vector(m))
Result:
> m
C1 C2 C3
R1 1 4 7
R2 2 5 8
R3 3 6 9
> d
i j score
1 R1 C1 1
2 R2 C1 2
3 R3 C1 3
4 R1 C2 4
5 R2 C2 5
6 R3 C2 6
7 R1 C3 7
8 R2 C3 8
9 R3 C3 9
Please note that this code converts the matrix into a data.frame, since the row and column names can be strings and a matrix cannot have columns of different types.
If you are sure that all row and column names are numbers, you can coerce the result to a matrix.
If you convert your matrix first to a table (with as.table) then to a data frame (as.data.frame) then it will accomplish what you are asking for. A simple example:
> tmp <- matrix( 1:12, 3 )
> dimnames(tmp) <- list( letters[1:3], LETTERS[4:7] )
> as.data.frame( as.table( tmp ) )
Var1 Var2 Freq
1 a D 1
2 b D 2
3 c D 3
4 a E 4
5 b E 5
6 c E 6
7 a F 7
8 b F 8
9 c F 9
10 a G 10
11 b G 11
12 c G 12
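If the default column names are not wanted, the conversion can be adjusted in the same call. This is a sketch; the column names i, j and score follow the wording of the question, and the object name long is made up here.

```r
tmp <- matrix(1:12, 3)
dimnames(tmp) <- list(letters[1:3], LETTERS[4:7])
# responseName replaces the default "Freq" column name;
# stringsAsFactors = FALSE keeps the ids as character vectors.
long <- as.data.frame(as.table(tmp), responseName = "score",
                      stringsAsFactors = FALSE)
# Rename the two id columns to match the question's i and j.
names(long)[1:2] <- c("i", "j")
head(long, 3)
```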