Select the lowest values per row using two matrices in R

If I have this matrix (mat):
set.seed(140213)
mat <- matrix(runif(16,0,1),nrow = 4)
colnames(mat) <- 1:4
rownames(mat) <- 5:8
#> mat
# 1 2 3 4
#5 0.1120015 0.01454408 0.3411633 0.3456254
#6 0.5709174 0.70443202 0.9114756 0.9157580
#7 0.1500032 0.40889119 0.6231543 0.9736331
#8 0.9773827 0.45136413 0.9706694 0.5022132
I can get the names of the two lowest columns for each row with:
namesmat <- t(apply(mat, 1, function(x)
head(names(x)[order(x, decreasing = FALSE)], 2)))
# [,1] [,2]
#5 "2" "1"
#6 "1" "2"
#7 "1" "2"
#8 "2" "4"
Now my question is:
If I have another matrix (mat2)
set.seed(14022013) ; mat2 <- matrix(runif(16,0,1),nrow = 4)
How can I get the two lowest columns for each row of mat2, as I did before for mat, but ignoring the columns I already selected from mat?
E.g., if in mat2 the lowest columns for row 5 were cols 3 & 1, col 1 was already selected from mat, so I would have to choose the next lowest. If the next lowest was col 2 (also already selected), I would have to move on to the one after that. Let me know if this is unclear.
My instinct is to do some sort of paired apply using mat2 and namesmat, like:
t(apply(mat, 1, function(x) head(
names(x)[order(x[! names(x) == apply(namesmat,1,c)] ,
decreasing = FALSE)], 2)))
I will be doing this over many mats, so that the mat2 selection depends on the mat selection. Then I will have a mat3, and the selection of the lowest columns per row will depend on the selections from mat and mat2 combined, i.e. ignoring all columns already selected for mat and mat2, and so on for several mats.
Obviously in this case there are only 4 columns, so I can only do this sequence twice.

This may be a bit convoluted, but it gets the work done. I have used simpler matrices to demonstrate what is going on.
> mat1 <- matrix(c(1,2,3,4,4,3,2,1), nrow=2 , byrow=T)
> mat2 <- matrix(c(5,6,7,8,8,7,6,5), nrow=2 , byrow=T)
> colnames(mat1) <- c("A", "B", "C", "D")
> colnames(mat2) <- c("A", "B", "C", "D")
> mat1
A B C D
[1,] 1 2 3 4
[2,] 4 3 2 1
> mat2
A B C D
[1,] 5 6 7 8
[2,] 8 7 6 5
> mymax <- function(x) {
+ names( x[order(x)] )[1:2]
+ }
> mymax2 <- function(x,y) {
+ z <- names( x[order(x)] )
+ z[!z %in% y][1:2]
+ }
> namesmat <- t(apply(mat1, 1, mymax))
> namesmat
[,1] [,2]
[1,] "A" "B"
[2,] "D" "C"
> namesmat2 <- t(sapply(1:nrow(mat2), function(i) mymax2(mat2[i,], namesmat[i,])))
> namesmat2
[,1] [,2]
[1,] "C" "D"
[2,] "B" "A"
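The question also asks about continuing through a mat3 and beyond. A minimal sketch of one way to chain the same idea over a list of matrices is given here; pick_lowest, taken, and picks are hypothetical names introduced for illustration, and the sketch assumes every matrix has the same column names and row order.

```r
# Hypothetical helper: for each row, return the k lowest columns that
# are not already listed for that row in `taken`.
pick_lowest <- function(m, taken = NULL, k = 2) {
  do.call(rbind, lapply(seq_len(nrow(m)), function(i) {
    ord <- colnames(m)[order(m[i, ])]          # columns, lowest first
    if (!is.null(taken)) ord <- ord[!ord %in% taken[i, ]]
    ord[seq_len(k)]                            # NA-padded if too few remain
  }))
}

mat1 <- matrix(c(1, 2, 3, 4, 4, 3, 2, 1), nrow = 2, byrow = TRUE)
mat2 <- matrix(c(5, 6, 7, 8, 8, 7, 6, 5), nrow = 2, byrow = TRUE)
colnames(mat1) <- colnames(mat2) <- c("A", "B", "C", "D")

mats  <- list(mat1, mat2)
taken <- NULL                                  # grows by k columns per matrix
picks <- vector("list", length(mats))
for (j in seq_along(mats)) {
  picks[[j]] <- pick_lowest(mats[[j]], taken)
  taken <- cbind(taken, picks[[j]])
}
picks[[2]]
```

With the two example matrices, picks[[1]] and picks[[2]] reproduce namesmat and namesmat2 from the transcript. Once a row runs out of unused columns, the remaining entries are NA.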

Related

Matching matrix rows with list elements in R

Reproducible data:
dat1 <- matrix(0, nrow = 9, ncol = 2)
dat1[,1] <- rep(1:3,3)
dat1[,2] <- c(1,1,1,2,2,2,3,3,3)
dat2 <- list()
dat2[[1]] <- matrix(c(1,2,1,3), nrow = 2, ncol = 2)
dat2[[2]] <- matrix(c(1,1,2,3,1,3), nrow = 3, ncol = 2 )
> dat1
[,1] [,2]
[1,] 1 1
[2,] 2 1
[3,] 3 1
[4,] 1 2
[5,] 2 2
[6,] 3 2
[7,] 1 3
[8,] 2 3
[9,] 3 3
> dat2
[[1]]
[,1] [,2]
[1,] 1 1
[2,] 2 3
[[2]]
[,1] [,2]
[1,] 1 3
[2,] 1 1
[3,] 2 3
I have a matrix (dat1) and a list of matrices (dat2).
Some rows of dat1 are the same as rows of the matrices in dat2. My objective is to find the row numbers of dat1 that match each row of each dat2 element and store them in a list. An example of the output:
> ex.result
[[1]]
[,1]
[1,] 1
[2,] 8
[[2]]
[,1]
[1,] 7
[2,] 1
[3,] 8
I am looking for a fast way to do this without using time-consuming loops.
A slightly different approach (the %>% pipe comes from the magrittr package):
library(magrittr)
lapply( dat2, function(m) {
apply( m, 1, function(r)
which( apply( sweep( dat1, 2, r, "==" ), 1, all ) ) ) %>% as.matrix })
Output:
[[1]]
[,1]
[1,] 1
[2,] 8
[[2]]
[,1]
[1,] 7
[2,] 1
[3,] 8
Here is an option:
lapply(dat2, function(mat)
apply(mat, 1, function(row)
match(toString(row), apply(dat1, 1, toString))))
#[[1]]
#[1] 1 8
#
#[[2]]
#[1] 7 1 8
This returns a list with integer vectors instead of a list with array/matrix entries though.
In the same vein as above, using Map() and vector recycling:
# Coercing to a data.frame to recycle the vector that is used to search:
setNames(
Map(function(x, y){
matrix(
match(
apply(y, 1, paste, collapse = ", "),
x
)
)
},
data.frame(apply(dat1, 1, paste, collapse = ", ")),
dat2),
seq_len(length(dat2)))
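The string-key answers above can also be arranged so that the keys for dat1 are built only once rather than once per searched row. The following is a sketch under that assumption; row_key and keys1 are hypothetical names, and the "|" separator is assumed not to occur in the data.

```r
dat1 <- matrix(0, nrow = 9, ncol = 2)
dat1[, 1] <- rep(1:3, 3)
dat1[, 2] <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
dat2 <- list(
  matrix(c(1, 2, 1, 3), nrow = 2, ncol = 2),
  matrix(c(1, 1, 2, 3, 1, 3), nrow = 3, ncol = 2)
)

# Hypothetical helper: collapse each row into a single string key.
row_key <- function(m) apply(m, 1, paste, collapse = "|")

keys1 <- row_key(dat1)                      # computed once for dat1
res <- lapply(dat2, function(m) matrix(match(row_key(m), keys1)))
res
```

Each element of res is a one-column matrix, as in the requested output. match() returns the first matching row only, so duplicated rows in dat1 would need a different approach.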

How can I find all the numeric entries in a specific column of a matrix and print them?

For example, with this matrix:
dat <- matrix(c(1,"a","b", 11,12,13), nrow = 2, ncol = 3, byrow = TRUE,
dimnames = list(c("row1", "row2"),
c("C.1", "C.2", "C.3")))
dat
C.1 C.2 C.3
row1 "1" "a" "b"
row2 "11" "12" "13"
We can use grep.
> grep("\\d+", c(dat), value=TRUE)
[1] "1" "11" "12" "13"
If you want the location of each element in the matrix they come from, then you can use:
> num <- grep("\\d+", c(dat), value=TRUE)
> positions <- sapply(num, function(x) which(dat == x, arr.ind = TRUE))
> rownames(positions) <- c("row", "col")
> positions
1 11 12 13
row 1 2 2 2
col 1 1 2 3
It tells you number 1 is in row 1, col 1 in matrix dat. Number 11 is in row 2, col 1 in dat.
Convert the matrix to a numeric vector, and then remove all NAs. This works because as.numeric() turns non-numeric strings into NA (with a coercion warning).
v <- as.numeric(dat)
v[!is.na(v)]
[1] 1 11 12 13
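Both answers scan the whole matrix, while the question asks about a specific column. A small sketch restricted to one column follows; the choice of column "C.2" and the names col_vals and numeric_entries are assumptions made for illustration. The pattern is anchored with ^ and $ so that only strings made up entirely of digits match (an unanchored "\\d+" would also match a string such as "a1").

```r
dat <- matrix(c(1, "a", "b", 11, 12, 13), nrow = 2, ncol = 3, byrow = TRUE,
              dimnames = list(c("row1", "row2"),
                              c("C.1", "C.2", "C.3")))

col_vals <- dat[, "C.2"]                       # one column, named by row
numeric_entries <- col_vals[grepl("^[0-9]+$", col_vals)]
print(numeric_entries)
```

The result keeps the row names, so it also shows which rows the numeric entries came from. This pattern covers non-negative integers only; decimals or signs would need a wider pattern.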

R Inserting a Dataframe/List into a Dataframe Element

I'd like to insert a dataframe into a dataframe element, such that if I called df1[1,1] I would get:
[A B]
[C D]
I thought this was possible in R, but perhaps I am mistaken. In a project of mine, I am essentially working with a 50x50 matrix, where I'd like each element to contain a column of data made up of numbers with labeled rows.
Trying to do something like df1[1,1] <- df2 yields the following warning:
Warning message:
In [<-.data.frame(*tmp*, i, j, value = list(DJN.10 = c(0, 3, :
replacement element 1 has 144 rows to replace 1 rows
And calling df1[1,1] yields 0. I've tried inserting the data in various ways, such as with as.vector() and as.list(), with no success.
Perhaps a matrix could work for you, like so:
x <- matrix(list(), nrow=2, ncol=3)
print(x)
# [,1] [,2] [,3]
#[1,] NULL NULL NULL
#[2,] NULL NULL NULL
x[[1,1]] <- data.frame(a=c("A","C"), b=c("B","D"))
x[[1,2]] <- data.frame(c=2:3)
x[[2,3]] <- data.frame(x=1, y=2:4)
x[[2,1]] <- list(1,2,3,5)
x[[1,3]] <- list("a","b","c","d")
x[[2,2]] <- list(1:5)
print(x)
# [,1] [,2] [,3]
#[1,] List,2 List,1 List,4
#[2,] List,4 List,1 List,2
x[[1,1]]
# a b
#1 A B
#2 C D
class(x)
#[1] "matrix" (in R >= 4.0.0 this prints "matrix" "array")
typeof(x)
#[1] "list"
See here for details.
Each column in your data.frame can be a list. Just make sure that the list is as long as the number of rows in your data.frame.
Columns can be added using the standard $ notation.
Example:
x <- data.frame(matrix(NA, nrow=2, ncol=3))
x$X1 <- I(list(data.frame(a=c("A","C"), b=c("B","D")), matrix(1:10, ncol = 5)))
x$X2 <- I(list(data.frame(c = 2:3), list(1, 2, 3, 4)))
x$X3 <- I(list(list("a", "b", "c"), 1:5))
x
# X1 X2 X3
# 1 1:2, 1:2 2:3 a, b, c
# 2 1, 2, 3,.... 1, 2, 3, 4 1, 2, 3,....
x[1, 1]
# [[1]]
# a b
# 1 A B
# 2 C D
#
x[2, 1]
# [[1]]
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 3 5 7 9
# [2,] 2 4 6 8 10
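Note that x[1, 1] above returns a list of length one that wraps the stored object. To get the data frame itself, the list column can be indexed with [[ ]]. A minimal sketch, using a hypothetical df1 with a list column named payload:

```r
df1 <- data.frame(id = 1:2)
# I() keeps the list as a single list column instead of expanding it.
df1$payload <- I(list(data.frame(a = c("A", "C"), b = c("B", "D")), 1:5))

inner <- df1$payload[[1]]       # the stored data frame, not a list wrapper
is.data.frame(inner)
inner[1, "b"]                   # a cell of the inner data frame
```

The same indexing works for any stored object, for example df1$payload[[2]] returns the integer vector 1:5.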

How do I get which.max to return the row name and not an index number

After running correlations I need to identify the row of the maximum value in each column. I am using which.max, but I cannot get the row name. Instead I get an index number, which is not what I need. Each row has a name.
apply(my.data,2,which.max)
# create example data
set.seed(1)
df <- data.frame(col1=runif(100), col2=runif(100))
row.names(df) <- paste0("row", 1:100)
# get max
rownames(df[apply(df, 2, which.max), ])
# [1] "row18" "row4"
The result from running correlations should be a matrix, so here's an example using a matrix.
> M <- matrix(c(1,5,3,17,6,8,9,2,3,10,8,4), 4, 3)
> rownames(M) <- letters[1:4]
> M
## [,1] [,2] [,3]
## a 1 6 3
## b 5 8 10
## c 3 9 8
## d 17 2 4
> rownames(M)[apply(M, 2, which.max)]
## [1] "d" "c" "b"
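A variant that may be convenient when the columns are named as well: which.max() keeps the name of the winning element, so taking names() inside apply() gives a vector of row names labeled by column. This is a sketch; the column names x, y, z and the name max_rows are assumptions added for illustration.

```r
M <- matrix(c(1, 5, 3, 17, 6, 8, 9, 2, 3, 10, 8, 4), 4, 3,
            dimnames = list(letters[1:4], c("x", "y", "z")))

# For each column, take the name of the element that which.max() picks.
max_rows <- apply(M, 2, function(col) names(which.max(col)))
max_rows
```

Here max_rows holds "d", "c", "b" for columns x, y, z. As with which.max() in general, ties are resolved in favor of the first maximum.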

Named rows and cols for matrices in R

Is it possible to have named rows and columns in matrices?
For example:
[,a] [,b]
[a,] 1 , 2
[b,] 3 , 4
Is it even reasonable to have such a thing for exploring the data?
Sure. Use dimnames:
> a <- matrix(1:4, nrow = 2)
> a
[,1] [,2]
[1,] 1 3
[2,] 2 4
> dimnames(a) <- list(c("A", "B"), c("AA", "BB"))
> a
AA BB
A 1 3
B 2 4
With dimnames, you can provide a list of (first) rownames and (second) colnames for your matrix. Alternatively, you can specify rownames(x) <- whatever and colnames(x) <- whatever.
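Once the names are set, they can be used for indexing in place of numeric positions, which is where they tend to help most when exploring data. A short sketch with the same matrix (the variable names cell and col_bb are assumptions):

```r
a <- matrix(1:4, nrow = 2,
            dimnames = list(c("A", "B"), c("AA", "BB")))

cell   <- a["B", "AA"]   # single element selected by row and column name
col_bb <- a[, "BB"]      # whole column, returned as a vector named by row
cell
col_bb
```

Passing dimnames directly to matrix() is equivalent to assigning them afterwards with dimnames(a) <- list(...).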

Resources