How to obtain the highest p-values of a matrix?

In R, I wrote the following script:
X = 1:3
Y = 1:2
Z = 1:4
nlargest <- function(m, n) {
res <- order(m)[seq_len(n)]
pos <- arrayInd(res, dim(m), useNames = TRUE)
values = m[res]
position = pos
list(unique(values),
unique(position))
}
m = do.call(expand.grid, lapply(list(X, Y , Z), unique))
m
ecart=as.matrix(dist(m, method = "euclidean", diag = TRUE, upper = FALSE, p = 2))
alpha=0.8
m=10
proba_matrix=alpha*exp(-ecart)
nlargest(proba_matrix, 10)
The function nlargest, which I took from a similar question, does not extract the n largest values of the matrix proba_matrix as expected.
I am looking for a way to obtain the n highest values of any matrix together with their positions.
Example of the expected result:
m <- matrix(seq(1,9,by=1),nrow=3,byrow=TRUE);
diag(m) <- 1;
m
[1] 1 2 3 4 5 6 7 8 9
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    1    6
[3,]    7    8    1
# I need to obtain :
nlargest(m,4)
$values
[1] 4 6 7 8
$position
     row col
[1,]   2   1
[2,]   2   3
[3,]   3   1
[4,]   3   2

In a 2-D matrix you could do something like this:
library(dplyr)
nlargest <- function(m, n){
  # one row per matrix cell: its value plus its row and column index
  df <- as.data.frame(cbind(as.vector(m), expand.grid(1:nrow(m), 1:ncol(m))))
  colnames(df) <- c("value", "row", "column")
  # drop duplicated values, then keep the n largest
  largest <- dplyr::distinct(df, value, .keep_all = TRUE) %>%
    dplyr::slice_max(order_by = value, n = n)
  l <- list(values = largest$value,
            position = largest[, c("row", "column")])
  return(l)
}
m <- matrix(c(9, seq(1,9,by=1)),nrow=2,byrow=TRUE)
nlargest(m, 4)
$values
[1] 9 8 7 6
$position
  row column
1   1      1
2   2      4
3   2      3
4   2      2
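For comparison, here is a base R sketch without dplyr. The function imported in the question returns the smallest values because order() sorts in increasing order by default; passing decreasing = TRUE reverses that. The name nlargest_base is made up for this example, and unlike the dplyr version above it does not drop duplicated values.

```r
# Hypothetical helper: n largest values of a matrix with their positions.
nlargest_base <- function(m, n) {
  n <- min(n, length(m))                           # guard against n > number of cells
  idx <- order(m, decreasing = TRUE)[seq_len(n)]   # linear indices, largest first
  pos <- arrayInd(idx, dim(m))                     # convert to (row, col) pairs
  colnames(pos) <- c("row", "col")
  list(values = m[idx], position = pos)
}

# Matrix from the question's "expected results" example
m <- matrix(1:9, nrow = 3, byrow = TRUE)
diag(m) <- 1
res <- nlargest_base(m, 4)
res$values    # 8 7 6 4
res$position  # (3,2), (3,1), (2,3), (2,1)
```

The values come back in decreasing order, so they are the same four cells as in the question's expected output, just listed largest first.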

Related

Apply mapping to matrix columns in R

How can I apply the mapping
x′ = ax + by and y′ = cx + dy
in R to a matrix with 2 columns x and y of numbers, knowing the values of a, b, c, d?

Each of the columns x and y is a vector, so you can do:
a <- 1
b <- 2
c <- 3
d <- 4
df <- data.frame(x = 1:4, y = 4:1)
xx <- a*df$x + b*df$y
yy <- c*df$x + d*df$y
xx
> 9 8 7 6
yy
> 19 18 17 16

Another option is matrix multiplication using %*%.
Using Clemsang's data:
mat <- as.matrix(data.frame(x = 1:4, y = 4:1))
A <- matrix(c(1:4), nrow = 2)
mat %*% A
#      [,1] [,2]
# [1,]    9   19
# [2,]    8   18
# [3,]    7   17
# [4,]    6   16
The columns correspond to x' and y'.
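To make the link between a, b, c, d and the matrix A explicit, here is a small sketch. The names coefs, x_new and y_new are only illustrative. Because matrix() fills by column, (a, b) ends up in the first column and (c, d) in the second, which is what mat %*% A needs.

```r
# Coefficients of x' = a*x + b*y and y' = c*x + d*y (values from the answer above)
coefs <- c(a = 1, b = 2, c = 3, d = 4)

# Column 1 holds (a, b), column 2 holds (c, d)
A <- matrix(coefs, nrow = 2,
            dimnames = list(c("x", "y"), c("x_new", "y_new")))

mat <- cbind(x = 1:4, y = 4:1)
mapped <- mat %*% A
mapped[, "x_new"]  # 9 8 7 6
mapped[, "y_new"]  # 19 18 17 16
```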

Minimum Cost Distance in Matrix

I have got this matrix below
k
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    4    9   16   25   36
[2,]    1    3    7   13   21   31
[3,]    2    2    5   10   17   26
[4,]    4    2    4    8   14   22
[5,]    7    3    3    6   11   18
[6,]   11    5    3    5    9   15
and I want to walk through it starting from k[1,1] and ending at k[6,6]. At each step I move to the cell given by min(k[i,j+1], k[i+1,j], k[i+1,j+1]), and the answer I hope to get is something like 1+1+2+2+3+3+5+9+15 = 41 (travelling along the minimum path).
So it takes the minimum starting from k[1,1] and then continues downwards until k[6,6].
warpingDist = function(x, y, z){
mincal = numeric(length(k))
m = nrow(k)
n = ncol(k)
i=1
j=1
mincal = which(k == min(k[i, j+1], k[i+1, j], k[i+1, j+1]), arr.ind = TRUE)
indx = data.frame(mincal)
i= indx$row
j= indx$col
if(i != m || j!=n)
{
warpingDist(k[i, j+1], k[i+1, j], k[i+1, j+1])
}
warpSum = sum(mincal)
return(warpSum)
}
value = apply(k, c(1,2), warpingDist)
value
When I run this code it displays the below:
Error: object 'value' not found
Not sure why this is happening...

As you don't provide a minimal reproducible example, I can only guess:
warpingDist = function(x, y, z, k){
# browser() # This is a good option to activate, if you run your script in RStudio
...
return(warpSum)
}
# your code
k <- whatever it is
result <- warpingDist(x, y, z, k)
I hope that helps.

I am glad I was finally able to solve the problem, and the code runs fast as well.
Problem: find the minimum cost path through a matrix. For clarity, assume I have the matrix given below:
[1,]  1  4  6  7  8  9  0
[2,] 10 12  1  3 11  2  0
[3,] 11 12  2  8 17  1  0
[4,] 20  1 18  4 28  1  0
[5,]  5 20 80  6  9  3  0
My goal is to add up the values along the minimum path, starting from kata[1,1] in the first row and ending in the last row at K[5,4]. So effectively, I want to have something like 1 + 4 + 1 + 2 + 4 + 6 + 9 + 3.
Below is the R code which I have used to implement this. It implements two functions:
# Function that calculates minimum of three values. Returns the Value.
minFUN <- function(Data, a, b){
d = (min(Data[a, b+1], Data[a+1, b], Data[a+1, b+1]))
return(d)
}
# Function that calculates the index of the minimum value, from which the
# The next iteration begins
NextRC <- function(Data, a, b){
d = min(Data[a, b+1], Data[a+1, b], Data[a+1, b+1])
if(d == Data[a, b+1]){
c = cbind(a, b+1)
}else
if(d == Data[a+1, b]){
c = cbind(a+1, b)
} else
if(d == Data[a+1, b+1]){
c = cbind(a+1, b+1)
}
return(c)
}
Je <- c()
NewRow = 1
NewCol = 1
# Warping Function that uses both functions above to loop through the dataset
WarpDist <- function(Data, a = NewRow, b = NewCol){
for(i in 1:4) {
Je[i] = minFUN(Data, a, b)
# Next Start Point
NSP = NextRC(Data, a,b)
NewRow = as.numeric(NSP[1,1])
NewCol = as.numeric(NSP[1,2])
a = NewRow
b = NewCol
}
return(Je)
}
Value=WarpDist(Data = Data, a = NewRow, b = NewCol)
warpo = Data[1,1] + sum(Value)
w = sqrt(warpo)
The result is the minimum path from the first row to the last row
Value
[1] 4 1 2 4 6
The result omits 9 and 3 because the path is already on the last row at that point.
Time:
Time difference of 0.08833408 secs
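The same greedy walk can be sketched with a while loop, so that the number of steps does not have to be hard-coded (the for(i in 1:4) above takes four steps, while the printed Value has five entries). This is only a sketch: greedy_path is a made-up name, Data is the matrix shown above, and, like the code above, it picks the locally smallest neighbour at each step, which is not guaranteed to give the globally cheapest path.

```r
# Matrix from the answer above
Data <- matrix(c( 1,  4,  6, 7,  8, 9, 0,
                 10, 12,  1, 3, 11, 2, 0,
                 11, 12,  2, 8, 17, 1, 0,
                 20,  1, 18, 4, 28, 1, 0,
                  5, 20, 80, 6,  9, 3, 0),
               nrow = 5, byrow = TRUE)

# Hypothetical helper: greedy walk from (a, b), moving right, down or diagonally
# to the smallest neighbour until the last row or last column is reached.
greedy_path <- function(Data, a = 1, b = 1) {
  path <- Data[a, b]
  while (a < nrow(Data) && b < ncol(Data)) {
    cand <- rbind(c(a, b + 1), c(a + 1, b), c(a + 1, b + 1))  # right, down, diagonal
    best <- which.min(Data[cand])   # first minimum wins, same tie order as NextRC
    a <- cand[best, 1]
    b <- cand[best, 2]
    path <- c(path, Data[a, b])
  }
  path
}

greedy_path(Data)       # 1 4 1 2 4 6
sum(greedy_path(Data))  # 18
```

The returned vector includes the starting cell, so it is Data[1,1] followed by the five values shown in Value.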

R - Collapse into vector same member of a list

I have a list with same structure for every member as the following
config <- NULL
config[["secA"]] <- NULL
config[["secA"]]$VAL <- 0
config[["secA"]]$ARR <- c(1,2,3,4,5)
config[["secA"]]$DF <- data.frame(matrix(c(1,5,3,8),2,2))
config[["secB"]] <- NULL
config[["secB"]]$VAL <- 1
config[["secB"]]$ARR <- c(1,3,2,4,9)
config[["secB"]]$DF <- data.frame(matrix(c(2,6,1,9),2,2))
config[["secC"]] <- NULL
config[["secC"]]$VAL <- 5
config[["secC"]]$ARR <- c(4,2,1,5,8)
config[["secC"]]$DF <- data.frame(matrix(c(4,2,1,7),2,2))
and I need to obtain 3 vectors VAL, ARR and DF, each with the concatenated elements of the corresponding member, such as
# VAL: 0,1,5
# ARR: 1,2,3,4,5,1,3,2,4,9,4,2,1,5,8
# DF: 1,5,3,8,2,6,1,9,4,2,1,7
Looking at similar situations, I have the feeling I need a combination of do.call and cbind or lapply, but I have no clue how. Any suggestions?

config <- NULL
config[["secA"]] <- NULL
config[["secA"]]$VAL <- 0
config[["secA"]]$ARR <- c(1,2,3,4,5)
config[["secA"]]$DF <- data.frame(matrix(c(1,5,3,8),2,2))
config[["secB"]] <- NULL
config[["secB"]]$VAL <- 1
config[["secB"]]$ARR <- c(1,3,2,4,9)
config[["secB"]]$DF <- data.frame(matrix(c(2,6,1,9),2,2))
config[["secC"]] <- NULL
config[["secC"]]$VAL <- 5
config[["secC"]]$ARR <- c(4,2,1,5,8)
config[["secC"]]$DF <- data.frame(matrix(c(4,2,1,7),2,2))
sapply(names(config[[1]]), function(x)
unname(unlist(sapply(config, `[`, x))), USE.NAMES = TRUE)
# $VAL
# [1] 0 1 5
#
# $ARR
# [1] 1 2 3 4 5 1 3 2 4 9 4 2 1 5 8
#
# $DF
# [1] 1 5 3 8 2 6 1 9 4 2 1 7
Or you can use this clist function (defined at the end of this answer):
(l <- Reduce(clist, config))
# $VAL
# [1] 0 1 5
#
# $ARR
# [1] 1 2 3 4 5 1 3 2 4 9 4 2 1 5 8
#
# $DF
#   X1 X2 X1 X2 X1 X2
# 1  1  3  2  1  4  1
# 2  5  8  6  9  2  7
It merges data frames and matrices, so you need to unlist to get the vector you want
l$DF <- unname(unlist(l$DF))
l
# $VAL
# [1] 0 1 5
#
# $ARR
# [1] 1 2 3 4 5 1 3 2 4 9 4 2 1 5 8
#
# $DF
# [1] 1 5 3 8 2 6 1 9 4 2 1 7
Function
clist <- function (x, y) {
islist <- function(x) inherits(x, 'list')
'%||%' <- function(a, b) if (!is.null(a)) a else b
get_fun <- function(x, y)
switch(class(x %||% y),
matrix = cbind,
data.frame = function(x, y)
do.call('cbind.data.frame', Filter(Negate(is.null), list(x, y))),
factor = function(...) unlist(list(...)), c)
stopifnot(islist(x), islist(y))
nn <- names(rapply(c(x, y), names, how = 'list'))
if (is.null(nn) || any(!nzchar(nn)))
stop('All non-NULL list elements should have unique names', domain = NA)
nn <- unique(c(names(x), names(y)))
z <- setNames(vector('list', length(nn)), nn)
for (ii in nn)
z[[ii]] <- if (islist(x[[ii]]) && islist(y[[ii]]))
Recall(x[[ii]], y[[ii]]) else
(get_fun(x[[ii]], y[[ii]]))(x[[ii]], y[[ii]])
z
}

Another approach, with slightly less code.
un_config <- unlist(config)
un_configNAM <- names(un_config)
vecNAM <- c("VAL", "ARR", "DF")
for(n in vecNAM){
assign(n, un_config[grepl(n, un_configNAM)])
}
This will return 3 vectors as the OP requested. However, it is generally more advantageous to store results in a list, as rawr suggests. You can of course adapt the above code so that the results are stored in a list.
l <- rep(list(NA), length(vecNAM))
i = 1
for(n in vecNAM){
l[[i]] <- un_config[grepl(n, un_configNAM)]
i = i +1
}
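A compact list-based variant, as a sketch: it uses lapply instead of sapply for the outer step, so the result is always a list even if all fields happen to have the same length (sapply would simplify that case to a matrix). The name collapsed is made up here, and config is rebuilt with list() only to keep the example self-contained.

```r
# Same data as in the question, built with list() for brevity
config <- list(
  secA = list(VAL = 0, ARR = c(1, 2, 3, 4, 5), DF = data.frame(matrix(c(1, 5, 3, 8), 2, 2))),
  secB = list(VAL = 1, ARR = c(1, 3, 2, 4, 9), DF = data.frame(matrix(c(2, 6, 1, 9), 2, 2))),
  secC = list(VAL = 5, ARR = c(4, 2, 1, 5, 8), DF = data.frame(matrix(c(4, 2, 1, 7), 2, 2)))
)

fields <- names(config[[1]])  # "VAL" "ARR" "DF"

# For each field, pull it out of every section and flatten to one plain vector
collapsed <- lapply(setNames(fields, fields), function(f) {
  unname(unlist(lapply(config, `[[`, f)))
})

collapsed$VAL  # 0 1 5
collapsed$DF   # 1 5 3 8 2 6 1 9 4 2 1 7
```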

Merge all possible combinations of multiple data frames

I would like to merge by columns all the possible pair combinations of these three data frames (i.e. nine combinations)
frame1 = data.frame(a=c(1,2,3), b=c(1,2,3), c=c(1,2,3))
frame2 = data.frame(a=c(2,1,3), b=c(2,1,3), c=c(2,1,3))
frame3 = data.frame(a=c(3,2,1), b=c(3,2,1), c=c(3,2,1))
which each contain the same 3 rows but not in the same order, so I would also like the merge to match on the pair of values in columns a and b of the two frames being merged. Example:
a b c
1 1 1
2 2 2
3 3 3
+
a b c
2 2 2
1 1 1
3 3 3
=
a.x b.x c.x a.y b.y c.y
1 1 1 1 1 1
2 2 2 2 2 2
3 3 3 3 3 3
I then want to take the absolute difference between each pair of values in columns c.x and c.y of each merged frame and sum these differences to obtain a "score" (which would of course be zero in this example). The score should go into the corresponding cell of an empty 3x3 matrix (i.e., the score of frame1 vs. frame2 should be located in cell [2,1], etc.):
nframes = 3
frames = c(frame1,frame2,frame3)
matrix = matrix(, nrow = nframes, ncol = nframes)
matrix_scores = data.frame(matrix)
for (i in frames){
for (j in frames)
{
x = merge(i, j, by=c("a","b"))
score = sum(abs(x$c.x - x$c.y))
matrix_scores[j,i] <- score
}
}
However, when I run the loop I obtain the following message:
Error in fix.by(by.x, x) : 'by' must specify uniquely valid columns
Also, I understand that the line
matrix_scores[j,i] <- score
will give an error, too, but I do not know how to express that I want the score to be stored in cell [1,1], for the first iteration of the loop (frame1 vs. frame1).
The resulting matrix should be a 3x3 matrix containing all zeros:
f1 f2 f3
frame1 0 0 0
frame2 0 0 0
frame3 0 0 0

You can do:
# Put all frames in a list
d <- list(frame1, frame2, frame3)
# get all merge-combinations
gr <- expand.grid(1:length(d), 1:length(d))
# function to merge and get the sum diff:
foo <- function(i, x, gr){
tmp <- merge(x[[gr[i, 1]]], x[[gr[i, 2]]], by=c("a", "b"))
sum(abs(tmp$c.x - tmp$c.y))
}
# result matrix
matrix(sapply(1:nrow(gr), foo, d, gr), length(d), length(d), byrow = TRUE)
[,1] [,2] [,3]
[1,] 0 0 0
[2,] 0 0 0
[3,] 0 0 0
# The scores are arranged as follows:
matrix(apply(gr, 1, paste, collapse="_"), 3, 3, byrow = TRUE)
[,1] [,2] [,3]
[1,] "1_1" "2_1" "3_1"
[2,] "1_2" "2_2" "3_2"
[3,] "1_3" "2_3" "3_3"
# alternative using apply:
# function to merge and get the sum diff:
foo <- function(y, x){
tmp <- merge(x[[ y[1] ]], x[[ y[2] ]], by=c("a", "b"))
sum(abs(tmp$c.x - tmp$c.y))
}
# result matrix
matrix(apply(gr, 1, foo, d), length(d), length(d), byrow = TRUE)
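On the error in the question: c(frame1, frame2, frame3) concatenates the columns of the three data frames into one list of nine vectors, so the loop iterates over single columns rather than data frames, and merge() cannot find a and b. Keeping the frames in a named list avoids that. The sketch below also gives a labelled score matrix; score and scores are made-up names.

```r
frame1 <- data.frame(a = c(1, 2, 3), b = c(1, 2, 3), c = c(1, 2, 3))
frame2 <- data.frame(a = c(2, 1, 3), b = c(2, 1, 3), c = c(2, 1, 3))
frame3 <- data.frame(a = c(3, 2, 1), b = c(3, 2, 1), c = c(3, 2, 1))

# A named list keeps each data frame intact and provides the dimnames
frames <- list(frame1 = frame1, frame2 = frame2, frame3 = frame3)

# Merge on a and b, then sum the absolute differences of the two c columns
score <- function(x, y) {
  merged <- merge(x, y, by = c("a", "b"))
  sum(abs(merged$c.x - merged$c.y))
}

# Outer sapply runs over columns (x), inner sapply over rows (y)
scores <- sapply(frames, function(x) sapply(frames, function(y) score(x, y)))
scores
```

With this layout, scores["frame2", "frame1"] holds the score of frame1 vs. frame2, matching the cell [2,1] convention from the question.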

cbind: is there a way to have missing values set to NA?

Please forgive me if I missed an answer to such a simple question.
I want to use cbind() to bind two columns. One of them is a single entry shorter in length.
Can I have R supply an NA for the missing value?
The documentation discusses a deparse.level argument but this doesn't seem to be my solution.
Further, if I may be so bold, would there also be a quick way to prepend the shorter column with NA's?

Try this:
x <- c(1:5)
y <- c(4:1)
length(y) = length(x)
cbind(x,y)
x y
[1,] 1 4
[2,] 2 3
[3,] 3 2
[4,] 4 1
[5,] 5 NA
or this:
x <- c(4:1)
y <- c(1:5)
length(x) = length(y)
cbind(x,y)
x y
[1,] 4 1
[2,] 3 2
[3,] 2 3
[4,] 1 4
[5,] NA 5
I think this will do something similar to what DWin suggested and work regardless of which vector is shorter:
x <- c(4:1)
y <- c(1:5)
lengths <- max(c(length(x), length(y)))
length(x) <- lengths
length(y) <- lengths
cbind(x,y)
The code above can also be condensed to:
x <- c(4:1)
y <- c(1:5)
length(x) <- length(y) <- max(c(length(x), length(y)))
cbind(x,y)
EDIT
Here is what I came up with to address the question:
"Further, if I may be so bold, would there also be a quick way to prepend the shorter column with NA's?"
inserted into the original post by Matt O'Brien.
x <- c(4:1)
y <- c(1:5)
first <- 1 # 1 means add NA to top of shorter vector
# 0 means add NA to bottom of shorter vector
if(length(x)<length(y)) {
if(first==1) x = c(rep(NA, length(y)-length(x)),x);y=y
if(first==0) x = c(x,rep(NA, length(y)-length(x)));y=y
}
if(length(y)<length(x)) {
if(first==1) y = c(rep(NA, length(x)-length(y)),y);x=x
if(first==0) y = c(y,rep(NA, length(x)-length(y)));x=x
}
cbind(x,y)
# x y
# [1,] NA 1
# [2,] 4 2
# [3,] 3 3
# [4,] 2 4
# [5,] 1 5
Here is a function:
x <- c(4:1)
y <- c(1:5)
first <- 1 # 1 means add NA to top of shorter vector
# 0 means add NA to bottom of shorter vector
my.cbind <- function(x,y,first) {
if(length(x)<length(y)) {
if(first==1) x = c(rep(NA, length(y)-length(x)),x);y=y
if(first==0) x = c(x,rep(NA, length(y)-length(x)));y=y
}
if(length(y)<length(x)) {
if(first==1) y = c(rep(NA, length(x)-length(y)),y);x=x
if(first==0) y = c(y,rep(NA, length(x)-length(y)));x=x
}
return(cbind(x,y))
}
my.cbind(x,y,first)
my.cbind(c(1:5),c(4:1),1)
my.cbind(c(1:5),c(4:1),0)
my.cbind(c(1:4),c(5:1),1)
my.cbind(c(1:4),c(5:1),0)
my.cbind(c(1:5),c(5:1),1)
my.cbind(c(1:5),c(5:1),0)
This version allows you to cbind two vectors of different mode:
x <- c(4:1)
y <- letters[1:5]
first <- 1 # 1 means add NA to top of shorter vector
# 0 means add NA to bottom of shorter vector
my.cbind <- function(x,y,first) {
if(length(x)<length(y)) {
if(first==1) x = c(rep(NA, length(y)-length(x)),x);y=y
if(first==0) x = c(x,rep(NA, length(y)-length(x)));y=y
}
if(length(y)<length(x)) {
if(first==1) y = c(rep(NA, length(x)-length(y)),y);x=x
if(first==0) y = c(y,rep(NA, length(x)-length(y)));x=x
}
x <- as.data.frame(x)
y <- as.data.frame(y)
return(data.frame(x,y))
}
my.cbind(x,y,first)
# x y
# 1 NA a
# 2 4 b
# 3 3 c
# 4 2 d
# 5 1 e
my.cbind(c(1:5),letters[1:4],1)
my.cbind(c(1:5),letters[1:4],0)
my.cbind(c(1:4),letters[1:5],1)
my.cbind(c(1:4),letters[1:5],0)
my.cbind(c(1:5),letters[1:5],1)
my.cbind(c(1:5),letters[1:5],0)
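The padding idea above can be generalised to any number of vectors. This is a rough sketch: cbind_pad and pad_top are made-up names, and since the result is a matrix, mixing numbers and characters coerces everything to character (use the data.frame version above if the modes must be kept).

```r
# Hypothetical helper: cbind vectors of unequal length, padding with NA.
# pad_top = TRUE puts the NAs before the shorter vectors, FALSE puts them after.
cbind_pad <- function(..., pad_top = FALSE) {
  vecs <- list(...)
  n <- max(lengths(vecs))                  # target length
  padded <- lapply(vecs, function(v) {
    fill <- rep(NA, n - length(v))         # zero-length if v is already long enough
    if (pad_top) c(fill, v) else c(v, fill)
  })
  do.call(cbind, padded)                   # argument names become column names
}

bottom <- cbind_pad(x = 1:5, y = 4:1)                 # NA in the last row of y
top    <- cbind_pad(x = 1:5, y = 4:1, pad_top = TRUE) # NA in the first row of y
```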

A while back I had put together a function called Cbind that was meant to do this sort of thing. In its current form, it should be able to handle vectors, data.frames, and matrices as the input.
For now, the function is here: https://gist.github.com/mrdwab/6789277
Here is how one would use the function:
x <- 1:5
y <- letters[1:4]
z <- matrix(1:4, ncol = 2, dimnames = list(NULL, c("a", "b")))
Cbind(x, y, z)
# x y z_a z_b
# 1 1 a 1 3
# 2 2 b 2 4
# 3 3 c NA NA
# 4 4 d NA NA
# 5 5 <NA> NA NA
Cbind(x, y, z, first = FALSE)
# x y z_a z_b
# 1 1 <NA> NA NA
# 2 2 a NA NA
# 3 3 b NA NA
# 4 4 c 1 3
# 5 5 d 2 4
The three functions required are padNA, dotnames, and Cbind, which are defined as follows:
padNA <- function (mydata, rowsneeded, first = TRUE) {
## Pads vectors, data.frames, or matrices with NA
temp1 = colnames(mydata)
rowsneeded = rowsneeded - nrow(mydata)
temp2 = setNames(
data.frame(matrix(rep(NA, length(temp1) * rowsneeded),
ncol = length(temp1))), temp1)
if (isTRUE(first)) rbind(mydata, temp2)
else rbind(temp2, mydata)
}
dotnames <- function(...) {
## Gets the names of the objects passed through ...
vnames <- as.list(substitute(list(...)))[-1L]
vnames <- unlist(lapply(vnames,deparse), FALSE, FALSE)
vnames
}
Cbind <- function(..., first = TRUE) {
## cbinds vectors, data.frames, and matrices together
Names <- dotnames(...)
datalist <- setNames(list(...), Names)
nrows <- max(sapply(datalist, function(x)
ifelse(is.null(dim(x)), length(x), nrow(x))))
datalist <- lapply(seq_along(datalist), function(x) {
z <- datalist[[x]]
if (is.null(dim(z))) {
z <- setNames(data.frame(z), Names[x])
} else {
if (is.null(colnames(z))) {
colnames(z) <- paste(Names[x], sequence(ncol(z)), sep = "_")
} else {
colnames(z) <- paste(Names[x], colnames(z), sep = "_")
}
}
padNA(z, rowsneeded = nrows, first = first)
})
do.call(cbind, datalist)
}
Part of the reason I stopped working on the function was that the gdata package already has a function called cbindX that handles cbinding data.frames and matrices with different numbers of rows. It will not work directly on vectors, so you need to convert them to data.frames first.
library(gdata)
cbindX(data.frame(x), data.frame(y), z)
# x y a b
# 1 1 a 1 3
# 2 2 b 2 4
# 3 3 c NA NA
# 4 4 d NA NA
# 5 5 <NA> NA NA
