Create a frequency matrix from a 4 dim matrix in R

I have a k*4 matrix in which each row is one combination from (1:20, 1:20, 1:20, 1:20) and specifies the node types of a quadruplet. For example, for k = 3 I have 3 tetrahedra whose node types are:
X <- matrix(c(1, 3, 1 ,4,
2, 5, 6 ,1,
12,20,15 ,3), 3,4,byrow=T)
Now I want to create a 20*8000 frequency table from it that records how often each node type is in contact with the three remaining node types. In other words, for each node in a quadruplet I want to know which types of nodes it is in contact with.
For example, for the first row I have a one at F[1, (1,3,4)], and also at F[3, (1,1,4)] and F[4, (1,1,3)].
I hope I have explained my problem clearly enough.
Please help me with the code for this conversion.
Note:
Since the first row of my X matrix is 1,3,1,4, the output matrix F should be updated as follows:
F[1, which(colnames(F) == "1 3 4")] <- F[1, which(colnames(F) == "1 3 4")] + 1
F[1, which(colnames(F) == "1 3 4")] <- F[1, which(colnames(F) == "1 3 4")] + 1
F[3, which(colnames(F) == "1 1 4")] <- F[3, which(colnames(F) == "1 1 4")] + 1
F[4, which(colnames(F) == "1 1 3")] <- F[4, which(colnames(F) == "1 1 3")] + 1
This means that each row of X adds 4 ones to the frequency matrix, one for each of its 4 nodes, and 2, 3 or 4 of them may land in the same cell. For example, because 1 is repeated in the first row, it adds two records to F[1, which(colnames(F) == "1 3 4")].

I'm not sure I understand, and if I do, then you are not doing this correctly because you are not properly ordering your triplets, so this is a guess. I'm thinking the vector c(3,1,4) should be different than the vector c(1,3,4). Correct me if I'm wrong about that.
I thought trying to work with a 20^4 array was rather excessive so I constructed an input matrix that would fit in a 5^4 array:
X <- matrix(c(1, 3, 1 ,4,
2, 5, 2 ,1,
3, 2, 5 ,4), 3,4, byrow=T)
We produce the combinations of 4 items taken three at a time from each row and arrange it in column major fashion:
array( apply( X, 1, function(x) combn(x, 3) ), dim=c(3,4,3) )
, , 1
[,1] [,2] [,3] [,4]
[1,] 1 1 1 3
[2,] 3 3 1 1
[3,] 1 4 4 4
, , 2
[,1] [,2] [,3] [,4]
[1,] 2 2 2 5
[2,] 5 5 2 2
[3,] 2 1 1 1
, , 3
[,1] [,2] [,3] [,4]
[1,] 3 3 3 2
[2,] 2 2 5 5
[3,] 5 4 4 4

I found an elementary answer to my question, but I think it is not as fast as it should be.
For example, I have a 3*4 matrix and, for simplicity, I assume there are only 5 types.
To build the frequency table for this situation I wrote the code below:
library(combinat)  # provides permn()

n <- 5
k <- nrow(X)
F <- matrix(0, n, n^3)
# Column names are every ordered triplet "a b c"
colnames(F) <- apply(expand.grid(1:n, 1:n, 1:n), 1, paste, collapse = " ")
for (i in 1:k) {
  for (j in 1:4) {
    # All orderings of the three nodes in contact with node j
    per <- simplify2array(permn(X[i, -j]))
    pert_charac <- apply(per, 2, paste, collapse = " ")
    num <- sapply(pert_charac, function(x) which(colnames(F) == x))
    F[X[i, j], num] <- F[X[i, j], num] + 1
  }
}
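One possible way to speed this up is to skip the string matching and compute the column position arithmetically. The sketch below assumes the same column order as expand.grid(1:n, 1:n, 1:n), where the first element varies fastest, so the triplet "a b c" sits in column a + (b-1)*n + (c-1)*n^2. The names Fq, perms, trip and cols are introduced here for illustration, and X is the 5-type example matrix from the answer above.

```r
n <- 5
X <- matrix(c(1, 3, 1, 4,
              2, 5, 2, 1,
              3, 2, 5, 4), 3, 4, byrow = TRUE)

# The 6 orderings of three positions, one per row (hard-coded, no extra package)
perms <- rbind(c(1, 2, 3), c(1, 3, 2), c(2, 1, 3),
               c(2, 3, 1), c(3, 1, 2), c(3, 2, 1))

Fq <- matrix(0, n, n^3)  # called Fq here to avoid masking F (alias of FALSE)
for (i in seq_len(nrow(X))) {
  for (j in 1:4) {
    rest <- X[i, -j]                                  # the three other nodes
    trip <- matrix(rest[as.vector(perms)], ncol = 3)  # 6 x 3, one ordering per row
    # Column of triplet "a b c" in expand.grid order (first element fastest)
    cols <- trip[, 1] + (trip[, 2] - 1) * n + (trip[, 3] - 1) * n^2
    cols <- unique(cols)  # repeated orderings count once, as in the loop above
    Fq[X[i, j], cols] <- Fq[X[i, j], cols] + 1
  }
}
```

With this layout, the cell for node 1 and triplet "1 3 4" is Fq[1, 1 + 2*5 + 3*25], that is Fq[1, 86].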

Related

Subset assignment of multidimensional array in R

I am trying to assign rows of a 3D array, but I don't know exactly how.
I have a 2D index array where each row holds the first and second index into the 3D array, and a 2D value array which I want to insert into the 3D array. The simplest way I found to do this was
indexes <- cbind(1:30, rep(c(1, 2), 15))
rows <- cbind(1:20, 31:50, 71:90)
for (i in 1:nrow(indexes)) for (j in 1:3)
data[indexes[i,1], indexes[i,2], j] <- rows[i, j]
But this is hard to read, because it uses nested indexing, so I was hoping there was a simpler way, like
data[indexes,] <- rows
(this does not work)
What I've tried:
this question shows how to index the array (without assignment)
apply(data, 3, `[`, indexes)
but this doesn't allow assignment
apply(data, 3, `[`, indexes) <- rows #: could not find function "apply<-"
nor does using [<- work:
apply(data, 3, `[<-`, indexes, rows)
because it treats rows as a vector.
Neither of the following works either
data[indexes[1], indexes[2],] <- rows #: subscript out of bounds
data[indexes,] <- rows #: incorrect number of subscripts on matrix
So is there a simpler way of assigning to a multidimensional array?
Your indexes variable implies that data has first dim of 30, but rows[30,j] doesn't exist. So your problem isn't well posed, and I'll change it.
The basic idea is that you can index a 3 way array by an n x 3 matrix. Each row of the matrix corresponds to a location in the 3 way array, so if you want to set entry data[1,2,3] to 4, and entry data[5,6,7] to 8, you'd use
index <- rbind(c(1,2,3), c(5,6,7))
data[index] <- c(4,8)
You will need to expand your indexes variable to replicate each row 3 times, then read the rows matrix as a vector, and then this works:
data <- array(NA, dim=c(30, 2, 3))
indexes <- cbind(1:30, rep(c(1, 2), 15))
rows <- cbind(1:30, 31:60, 71:100)
indexes1 <- indexes[rep(1:nrow(indexes), each = 3),]
indexes2 <- cbind(indexes1, 1:3)
data[indexes2] <- t(rows) # Transpose because R reads down columns first
I don't think this is any simpler than what you had with the for loops, but maybe you'll find it preferable.
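As a quick sanity check, the sketch below compares the matrix-index assignment with the nested loops from the question, using the corrected 30-row inputs from this answer. The names ref and idx are introduced here only for the comparison.

```r
indexes <- cbind(1:30, rep(c(1, 2), 15))
rows <- cbind(1:30, 31:60, 71:100)

# Reference result: the nested loops from the question
ref <- array(NA, dim = c(30, 2, 3))
for (i in seq_len(nrow(indexes))) {
  for (j in 1:3) {
    ref[indexes[i, 1], indexes[i, 2], j] <- rows[i, j]
  }
}

# Matrix-index version: one (i, j, k) row per cell to fill
data <- array(NA, dim = c(30, 2, 3))
idx <- cbind(indexes[rep(seq_len(nrow(indexes)), each = 3), ], 1:3)
data[idx] <- t(rows)  # t() so values are read row by row, matching idx

identical(data, ref)
```

Each row of idx names one cell, so the values on the right-hand side must come in the same order, which is why rows is transposed before it is read as a vector.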
After reading @user2554330's answer, I found a slightly simpler solution:
# initialize as in user2554330's answer
data <- ...
indexes <- ...
rows <- ...
indexes3 <- as.matrix(merge(indexes, 1:3))
data[indexes3] <- rows
comparison of indexes2 and indexes3 (using fewer elements):
# print(indexes2)
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 1 1 2
[3,] 1 1 3
[4,] 2 2 1
[5,] 2 2 2
[6,] 2 2 3
[7,] 3 1 1
[8,] 3 1 2
[9,] 3 1 3
[10,] 4 2 1
[11,] 4 2 2
[12,] 4 2 3
# print(indexes3)
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 2 2 1
[3,] 3 1 1
[4,] 4 2 1
[5,] 1 1 2
[6,] 2 2 2
[7,] 3 1 2
[8,] 4 2 2
[9,] 1 1 3
[10,] 2 2 3
[11,] 3 1 3
[12,] 4 2 3

R: How do I translate solve_LSAP into a dataframe?

I'm trying to do a large assignment problem, and am using LSAP. It works, but I am trying to get the output into a dataframe so that I can do more with it. However, the documentation on the function says "An object of class "solve_LSAP" with the optimal assignment of rows to columns", with no further information on the data. I cannot seem to crack open the class to break the data out into a more usable form.
I've provided their example code.
library(clue)  # provides solve_LSAP()
x <- matrix(c(5, 1, 4, 3, 5, 2, 2, 4, 4), nrow = 3)
y <- solve_LSAP(x, maximum = FALSE)
y
Output:
Optimal assignment:
1 => 3, 2 => 1, 3 => 2
I have 200+ assignments, and the baseline output is simply not usable for me. How can I translate it into a dataframe, or at least a matrix that looks something like the below?
Row Column
1 3
2 1
3 2
solve_LSAP returns, for each row, the index of the column assigned to it, so the result already contains everything you need for reconstruction:
len <- length(y)
parsedMat <- cbind(
1:len,
as.integer(y)
)
parsedMat
[,1] [,2]
[1,] 1 3
[2,] 2 1
[3,] 3 2
This can be turned into a solved matrix by:
solvedMat <- matrix(0, nrow = len, ncol = len)
solvedMat[parsedMat] <- 1
solvedMat
[,1] [,2] [,3]
[1,] 0 0 1
[2,] 1 0 0
[3,] 0 1 0
You can also turn this into a function that will return both outputs in form of a list, e.g.:
parseClueOutput <- function(x) {
len <- length(x)
parsedMat <- cbind(
1:len,
as.integer(x)
)
solvedMat <- matrix(0, nrow = len, ncol = len)
solvedMat[parsedMat] <- 1
return(
list(
parsedMat = parsedMat,
solvedMat = solvedMat
)
)
}
And use it as:
parseClueOutput(y)
$parsedMat
[,1] [,2]
[1,] 1 3
[2,] 2 1
[3,] 3 2
$solvedMat
[,1] [,2] [,3]
[1,] 0 0 1
[2,] 1 0 0
[3,] 0 1 0
As for the structure, a solve_LSAP object is not really complicated: it is essentially a numeric vector, which you can see with:
is.numeric(y)
[1] TRUE
Or:
str(y)
'solve_LSAP' num [1:3] 3 1 2
You can also turn the solvedMat or parsedMat into a dataframe easily - for example, the parsedMat:
setNames(as.data.frame(parsedMat), c('Row', 'Column'))
Row Column
1 1 3
2 2 1
3 3 2
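If only the data frame is needed, the intermediate matrix can probably be skipped. The sketch below is a minimal illustration: so that it runs without the clue package installed, y is a hand-built stand-in with the same values and class as the example result, which is an assumption made here and not something solve_LSAP requires. The name assignment is also introduced only for this example.

```r
# Stand-in for the example result (assumption: same values and class as above).
# With clue installed, use y <- solve_LSAP(x, maximum = FALSE) instead.
y <- structure(c(3, 1, 2), class = "solve_LSAP")

# Row i is assigned to column y[i]
assignment <- data.frame(Row = seq_along(y), Column = as.integer(y))
assignment
```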

Swap a negative value in an R data table with the previous column value

I have a data table in which I want to replace negative values with the positive value from the previous row of the same column. For example:
1 2 3 4
2 -3 -2 3
should be
1 2 3 4
2 2 3 3
Thanks!
Since there are no answers from more experienced guys, here is what I've come up with.
# I'm reconstructing your example:
n <- matrix(c(1, 2, 2, -3, 3, -2, 4, 3), nrow = 2)
n
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 2 -3 -2 3
changeMat <- function(mat) {
  new_mat <- mat
  for (i in seq_along(mat)) {
    # In column-major order, mat[i - 1] is the element one row up
    if (mat[i] < 0) new_mat[i] <- mat[i - 1]
  }
  new_mat
}
changeMat(n)
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 2 2 3 3
I checked that, for a data.table object dt, changeMat(as.matrix(dt)) works properly.
Anyway, I am pretty sure that there must be a smarter way...
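One candidate for a smarter way is to avoid the loop and use matrix indexing. This is only a sketch: it assumes that negatives in the first row are left untouched, since there is no previous row to copy from, and the names neg, above and out are introduced here.

```r
n <- matrix(c(1, 2, 2, -3, 3, -2, 4, 3), nrow = 2)

# (row, col) positions of the negative entries
neg <- which(n < 0, arr.ind = TRUE)
# Assumption: first-row negatives have no previous row, so they are skipped
neg <- neg[neg[, "row"] > 1, , drop = FALSE]

# Positions directly above each negative entry (same column, previous row)
above <- cbind(neg[, "row"] - 1, neg[, "col"])

out <- n
out[neg] <- n[above]
out
```

Like changeMat, this reads the replacement values from the original matrix, so a negative value sitting directly above another negative value is copied as-is.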

How to use data from every row to create a matrix by loop

I have a data frame like
df<-data.frame(a=c(1,2,3),b=c(4,5,6),c=c(7,8,9),d=c(10,11,12))
a b c d
1 1 4 7 10
2 2 5 8 11
3 3 6 9 12
I want to use every row to create 3 (nrow(df)) 2*2 matrices: the first from 1,4,7,10, the second from 2,5,8,11, and the third from 3,6,9,12, so that I get 3 matrices. Thank you.
We can use split to split up the dataset into list and use matrix
lapply(split.default(as.matrix(df), row(df)), matrix, 2)
If we need the matrix columns to be 1, 7 followed by 4, 10, use the byrow=TRUE
lapply(split.default(as.matrix(df), row(df)), matrix, 2, byrow=TRUE)
Or use apply with MARGIN = 1 and wrap it with list to get a list output
do.call("c", apply(df, 1, function(x) list(matrix(x, ncol=2))))
If we need a for loop, preassign a as a list with length equal to the number of rows of 'df'
a <- vector("list", nrow(df))
for(i in 1:nrow(df)){ a[[i]] <- matrix(unlist(df[i,]), ncol=2)}
a
Or if it can be stored as array
array(t(df), c(2, 2, 3))
Or using Map:
m <- matrix(c(t(df)), ncol = 2, byrow = TRUE)
p <- 2 # number of rows
Map(function(i,j) m[i:j,], seq(1,nrow(m),p), seq(p,nrow(m),p))
# [[1]]
# [,1] [,2]
# [1,] 1 4
# [2,] 7 10
# [[2]]
# [,1] [,2]
# [1,] 2 5
# [2,] 8 11
# [[3]]
# [,1] [,2]
# [1,] 3 6
# [2,] 9 12

Extracting a row from a data frame in R

Let's say we have a matrix something like this:
> A = matrix(
+ c(2, 4, 3, 1, 5, 7), # the data elements
+ nrow=2, # number of rows
+ ncol=3, # number of columns
+ byrow = TRUE) # fill matrix by rows
> A # print the matrix
[,1] [,2] [,3]
[1,] 2 4 3
[2,] 1 5 7
I only used this small example, but imagine the matrix were much bigger, say 200 rows and 5 columns. What I want to do is find the minimum value in column 3 and extract that row. In other words, find and return the row in which the 3rd attribute is the lowest in the entire column of that data frame. I tried:
dataToReturn <- which(A== min(A[, 3])
but this doesn't work.
Another way is to use which.min
A[which.min(A[, 3]), ]
##[1] 2 4 3
You can do this with a simple subsetting via [] and min:
A[A[,3] == min(A[,3]),]
[1] 2 4 3
This reads: Return those row(s) of A where the value of column 3 equals the minimum of column 3 of A.
If you have a matrix like this:
A <- matrix(c(2,4,3,1,5,7,1,3,3), nrow=3, byrow = T)
> A
[,1] [,2] [,3]
[1,] 2 4 3
[2,] 1 5 7
[3,] 1 3 3
> A[which.min(A[, 3]), ] #returns only the first row with minimum condition
[1] 2 4 3
> A[A[,3] == min(A[,3]),] #returns all rows with minimum condition
[,1] [,2] [,3]
[1,] 2 4 3
[2,] 1 3 3
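Since the question title mentions a data frame, here is a hedged sketch of the same two approaches on a data frame built from the matrix above. The column names V1 to V3 are the defaults that as.data.frame assigns, and first_min and all_min are names introduced here.

```r
A <- matrix(c(2, 4, 3, 1, 5, 7, 1, 3, 3), nrow = 3, byrow = TRUE)
df <- as.data.frame(A)  # default column names V1, V2, V3

# First row where column 3 reaches its minimum
first_min <- df[which.min(df$V3), ]

# Every row tied for the minimum of column 3
all_min <- df[df$V3 == min(df$V3), ]
```

Unlike a matrix, a data frame subset by rows stays a data frame even when a single row is returned, so no drop = FALSE is needed.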

Resources