Replace values in one matrix with values from another - r

I am a programming newbie attempting to compare two matrices. In case an element from first column in mat1 matches any element from first column in mat2, then I want that matching element in mat1 to be replaced with the neighboor (same row different column) to the match in mat2.
INPUT:
mat1<-matrix(letters[1:5])
mat2<-cbind(letters[4:8],1:5)
> mat1
[,1]
[1,] "a"
[2,] "b"
[3,] "c"
[4,] "d"
[5,] "e"
> mat2
[,1] [,2]
[1,] "d" "1"
[2,] "e" "2"
[3,] "f" "3"
[4,] "g" "4"
[5,] "h" "5"
wished OUTPUT:
> mat3
[,1]
[1,] "a"
[2,] "b"
[3,] "c"
[4,] "1"
[5,] "2"
I have attempted the following without succeeding:
> for(x in mat1){mat3<-ifelse(x==mat2,mat2[which(x==mat2),2],mat1)}
> mat3
[,1] [,2]
[1,] "a" "a"
[2,] "2" "b"
[3,] "c" "c"
[4,] "d" "d"
[5,] "e" "e"
Any advice will be very appreciated. Have spent a whole day without making it work. It doesn't matter to me if the elements are in a matrix or a data frame.
Thanks.

ifelse is vectorized so, we can use it on the whole column. Create the test logical condition in ifelse by checking whether the first column values of 'mat1' is %in% the first column of 'mat2', then , get the index of the corresponding values with match, extract the values of the second column with that index, or else return the first column of 'mat1'
mat3 <- matrix(ifelse(mat1[,1] %in% mat2[,1],
mat2[,2][match(mat1[,1], mat2[,1])], mat1[,1]))
mat3
# [,1]
#[1,] "a"
#[2,] "b"
#[3,] "c"
#[4,] "1"
#[5,] "2"

Here is another base R solution
v <- `names<-`(mat2[,2],mat2[,1])
mat3 <- matrix(unname(ifelse(is.na(v[mat1]),mat1,v[mat1])))
which gives
> mat3
[,1]
[1,] "a"
[2,] "b"
[3,] "c"
[4,] "1"
[5,] "2"

An option just using logical operation rather than a function
mat3 <- mat1
mat3[mat1[,1] %in% mat2[,1], 1] <- mat2[mat2[,1] %in% mat1[,1], 2]
Subsetting the values to find those that occur in both and replacing them where they do

Related

Comparing rows of matrix and replacing matching elements

I want to compare two matrices. If row elements in the first matrix matches row elements in the second matrix, then I want the rows in the second matrix to be kept. If the rows do not match, then I want those rows to be to empty. I apologise that I had a quite similar question recently, but I still haven't been able to solve this one.
INPUT:
> mat1<-cbind(letters[3:8])
> mat1
[,1]
[1,] "c"
[2,] "d"
[3,] "e"
[4,] "f"
[5,] "g"
[6,] "h"
> mat2<-cbind(letters[1:5],1:5)
> mat2
[,1] [,2]
[1,] "a" "1"
[2,] "b" "2"
[3,] "c" "3"
[4,] "d" "4"
[5,] "e" "5"
Expected OUTPUT:
> mat3
[,1] [,2]
[1,] "NA" "NA"
[2,] "NA" "NA"
[3,] "c" "3"
[4,] "d" "4"
[5,] "e" "5"
I have unsuccessfully attempted this:
> mat3<-mat2[ifelse(mat2[,1] %in% mat1[,1],mat2,""),]
Error in mat2[ifelse(mat2[, 1] %in% mat1[, 1], mat2, ""), ] :
no 'dimnames' attribute for array
I have been struggling for hours, so any suggestions are welcomed.
You were on the right track, but the answer is a little simpler than what you were trying. mat2[, 1] %in% mat1[, 1] returns the matches as a logical vector, and we can just set the non-matches to NA using that vector as an index.
mat1<-cbind(letters[3:8])
mat2<-cbind(letters[1:5],1:5)
match <- mat2[,1] %in% mat1 # gives a T/F vector of matches
mat3 <- mat2
mat3[!match,] <- NA

Make r ignore the order at which values appear in a column (created from pasting multiple columns)

Given a variable x that can take values A,B,C,D
And three columns for variable x:
df1<-
rbind(c("A","B","C"),c("A","D","C"),c("B","A","C"),c("A","C","B"), c("B","C","A"), c("D","A","B"), c("A","B","D"), c("A","D","C"), c("A",NA,NA),c("D","A",NA),c("A","D",NA))
How do I make column indicating the combination of in the three preceding column such that permutations (ABC, ACB, BAC) would be considered as the same combination of ABC, (AD, DA) would be considered as the same combination of AD?
Pasting the three columns with apply(df1,1,function(x) paste(x[!is.na(x)], collapse=", ")->df1$x4 and using df1%>%group(x4)%>%summarize(c=count(x4)) would count AD,DA as different instead of the same.
Edited title
My desired result would be to get
a<-cbind(c("ABC",4),c("ACD",2),c("ABD",2),c("A",1),c("AD",2))
Someone already solved my question. Thanks
You can apply function paste after sorting each row vector.
df1 <-
cbind(df1, apply(df1, 1, function(x) paste(sort(x), collapse = "")))
df1
# [,1] [,2] [,3] [,4]
# [1,] "A" "B" "C" "ABC"
# [2,] "A" "D" "C" "ACD"
# [3,] "B" "A" "C" "ABC"
# [4,] "A" "C" "B" "ABC"
# [5,] "B" "C" "A" "ABC"
# [6,] "D" "A" "B" "ABD"
# [7,] "A" "B" "D" "ABD"
# [8,] "A" "D" "C" "ACD"
# [9,] "A" NA NA "A"
#[10,] "D" "A" NA "AD"
#[11,] "A" "D" NA "AD"
You can now simply table the column, with no need for an external package to be loaded and more complex pipes.
table(df1[, 4])
#A ABC ABD ACD AD
#1 4 2 2 2

Why a row of my matrix is a list? R

I got a list of matrices X rows by 2 columns, called list_of_matrices_by2
list_of_matrices_by2[1:3]
[[1]]
[,1] [,2]
[1,] "7204" "d"
[2,] "7204" "a"
[[2]]
[,1] [,2]
[1,] "2032" "b"
[2,] "2032" "e"
[3,] "2032" "a"
[[3]]
[,1] [,2]
[1,] "802" "d"
[2,] "802" "b"
I want to stack all my matrices in a all_pairs matrix, so I did this
all_pairs=do.call(rbind,list_of_matrices_by2)
all_pairs[1:10]
[,1] [,2]
[1,] "7204" "d"
[2,] "7204" "a"
[3,] "2032" "b"
[4,] "2032" "e"
[5,] "2032" "a"
[6,] "802" "d"
[7,] "802" "b"
[8,] "4674" "c"
[9,] "4674" "a"
[10,] "3886" "b"
class(all_pairs)
[1] "matrix"
For some reason, I need the rows of this matrix to be of class matrix. But it shouldn't be a problem since rows of matrix are matrix in R, right. But no!
all_pairs[1,]
[[1]]
[1] "7204"
[[2]]
[1] "d
So here are my questions :
1) Why this? How come a row of a matrix can possibly be a list?
2) What would you do to make it work, i.e. each row of my matrix has to be a matrix?
I am sure that your list_of_matrices_by2 is like:
x <- list(matrix(list("7204","7204","d","a"), ncol = 2),
matrix(list("2032","2032","2032","b","e","a"), ncol = 2),
matrix(list("802","802","d","b"), ncol = 2))
#[[1]]
# [,1] [,2]
#[1,] "7204" "d"
#[2,] "7204" "a"
#[[2]]
# [,1] [,2]
#[1,] "2032" "b"
#[2,] "2032" "e"
#[3,] "2032" "a"
#[[3]]
# [,1] [,2]
#[1,] "802" "d"
#[2,] "802" "b"
unlist(lapply(x, class))
# [1] "matrix" "matrix" "matrix"
unlist(lapply(x, mode))
# [1] "list" "list" "list"
So you do have a matrix, but it is not a matrix of list instead of numeric. You can perform rbind as usual:
y <- do.call(rbind, x)
# [,1] [,2]
#[1,] "7204" "d"
#[2,] "7204" "a"
#[3,] "2032" "b"
#[4,] "2032" "e"
#[5,] "2032" "a"
#[6,] "802" "d"
#[7,] "802" "b"
It has class "matrix", but still with mode "list". That is why when you extract the first row you get a list:
y[1, ]
#[[1]]
#[1] "7204"
#[[2]]
#[1] "d"
I don't know how you obtain those matrices. If you can control the generation process, it would be good that you end up with matrices of numeric. If you can not control it, you need convert them manually as below:
x <- lapply(x, function (u) matrix(unlist(u), ncol = 2))
unlist(lapply(x, mode))
# [1] "character" "character" "character"
Then you can do
do.call(rbind, x)
Related:
Why is this matrix not numeric?
How to create a matrix of lists?
Ok guys, I finally found a solution.
Reminder
all_pairs=do.call(rbind,list_of_matrices_by2)
Extraction of all values of my matrix into a vector
extracted_values=unlist(as.vector(t(all_pairs)))
Build the new matrix
all_pairs_new=t(matrix(data = extracted_values,nrow = 2,ncol = 10000))

calculate the repeatence of combinations elements in R

suppose I have two vector like this :
l1 = c('C','D','E','F')
l2 = c('G','C','D','F')
I generate all combinations of two elements using combn function:
l1_vector = t(combn(l1,2))
l2_vector = t(combn(l2,2))
> l1_vector
[,1] [,2]
[1,] "C" "D"
[2,] "C" "E"
[3,] "C" "F"
[4,] "D" "E"
[5,] "D" "F"
[6,] "E" "F"
> l2_vector
[,1] [,2]
[1,] "G" "C"
[2,] "G" "D"
[3,] "G" "F"
[4,] "C" "D"
[5,] "C" "F"
[6,] "D" "F"
Now I want to calculate the repeat elements of l1_vector and l2_vector , as the example i give, the repeat of elements should be 3 (["C","D"],["C","F"],["D","F"])
How can I do that without using loop ?
As mentioned in the comments, you can use the merge function for this. Since the default behavior of merge is to use all of the available columns, it will return only those rows that are perfect matches.
> merge(l1_vector, l2_vector)
V1 V2
1 C D
2 C F
3 D F
>
> nrow(merge(l1_vector, l2_vector))
[1] 3
While merge is perfectly fine for your case, there is some work around.
If you just need the number of repeated elements:
choose(length(intersect(l1, l2)), 2)
[1] 3
If you need the repeated elements:
t(combn(intersect(l1, l2), 2))
[,1] [,2]
[1,] "C" "D"
[2,] "C" "F"
[3,] "D" "F"

Exclude rows where element has been previously met for N times

I have following input data:
# [,1] [,2]
#[1,] "A" "B"
#[2,] "A" "C"
#[3,] "A" "D"
#[4,] "B" "C"
#[5,] "B" "D"
#[6,] "C" "D"
Next I want to exclude rows where first or second element has been previously for N times. For example if N = 2 then need to exclude following rows:
#[3,] "A" "D" - element "A" has been 2 times
#[5,] "B" "D" - element "B" has been 2 times
#[6,] "C" "D" - element "C" has been 2 times
Note: Need to take into account excluding results immediately. For example if element has met 5 times and after removing it met only 1 times then need to leave next row with this element. Because now it meets 2 times.
Example (N=2):
Input data:
[,1] [,2]
[1,] "A" "B"
[2,] "A" "C"
[3,] "A" "D"
[4,] "A" "E"
[5,] "B" "C"
[6,] "B" "D"
[7,] "B" "E"
[8,] "C" "D"
[9,] "C" "E"
[10,] "D" "E"
Output data:
[,1] [,2]
[1,] "A" "B"
[2,] "A" "C"
[5,] "B" "C"
[10,] "D" "E"
There are possibly more elegant solutions... but this seems to work:
v <- c("A", "B", "C", "D", "E")
cmb <- t(combn(v, 2))
n <- 2
# Go through each letter
for (l in v)
{
# Find the combinations using that letter
rows <- apply(cmb, 1, function(x){l %in% x})
rows.2 <- which(rows==T)
if (length(rows.2)>n)
rows.2 <- rows.2[1:n]
# Take the first n rows containing the letter,
# then append all the ones not containing it
cmb <- rbind(cmb[rows.2,], cmb[rows==F,])
}
cmb
which outputs:
[,1] [,2]
[1,] "D" "E"
[2,] "B" "C"
[3,] "A" "C"
[4,] "A" "B"

Resources