Take the first unique value from a function - r

This is my function:
g <- function(x,y){
x <- (x-y):x
y <- 1:30 # ------> (y is always fixed 1:30)
z<- outer(x,y,fv) # ---->(fv is a previous function)
s <- colSums(z)
which(s==max(s),arr.ind=T)
}
It tells me the position of the max value in s. I basically have a problem in choosing y because given a small y, the max(s) appears more than once in s. For example:
#given x=53
> g(53,1)
[1] 13 16 20 22 25 26 27
> g(53,2)
[1] 20 25 26
> g(53,3)
[1] 20 25 26
> g(53,4)
[1] 20 25 26
> g(53,5)
[1] 20 25
> g(53,6)
[1] 25 -----> This is the only result I would like from my function (the right y is 6)
Another example:
# given x=71
> g(71,1)
[1] 7 9 14
> g(71,2)
[1] 7 14
> g(71,3)
[1] 14 -----> my desired result (right y=3)
Therefore, I would like a function that returns the first unique solution for the smallest possible y (e.g. g(53)=25, g(71)=14, ...). Any help? Thanks
Here is a simplified example, which I hope makes the question clearer:
#The idea is the same:
n <- 1:9
e <- rep(n,500)
p <- sample(e) # ---> (sampling mixes the matrix so that ties for the max appear later)
mat <- matrix(p,90)
g <- function(x,y){
x <- (x-y):x
k <- rowSums(mat[,x])
which(k==max(k), arr.ind=T)
}
#In my sample matrix :
k <- rowSums(mat[,44:45])
which(k==max(k), arr.ind=T)
[1] 44 71 90
#In fact
g(45,1)
[1] 44 71 90 # ---> more than one solution
g(45,2)
[1] 90 # ----> I would like to pick up this value, which is the first unique solution given x=45
Therefore, I would like a function that returns the first unique solution for the smallest possible y given x (in this new example: g(45)=90, ...).

I got it. It is a bit long, but I think it is right.
Taking the second, simplified example:
g <- function(x,y){
x <- (x-y):x
k <- rowSums(mat[,x])
q <- which(k==max(k), arr.ind=T)
length(q)
}
gv <- Vectorize(g)
l <- function(x){
y<- 1:30 # <- (until 30 to be sure)
z<- outer(x,y,gv)
y <- which.min(z) # <- (min is surely length=1 and which.min takes the first)
x <- (x-y):x
k <- rowSums(mat[,x])
q <- which(k==max(k), arr.ind=T)
q
}
l(45)
[1] 90

It seems like you could just do this with a recursive function. Consider the following:
set.seed(42)
n = 1:9
e = rep(n, 500)
p = sample(e)
mat = matrix(p, 90)
g <- function(x, y=1) {
xv <- (x-y):x
k <- rowSums(mat[, xv])
i <- which(k == max(k), arr.ind=T)
n <- length(i)
if (n == 1) {
return(y) # want to know the min y that solves the problem, right?
} else {
y <- y + 1 # increase y by 1
g(x,y) # run our function again with a new value of y
}
}
You should now be able to run g(45) and get 1 as the result, since that is the value of y that solves the problem, and g(33) to get 2.
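The same search can also be written as a plain loop with an upper bound on y, which avoids unbounded recursion when no unique maximum exists. This is only a sketch: the name first_unique, the max_y argument and the small hand-made matrix are assumptions for illustration, not part of the question.

```r
# Hypothetical helper: smallest y for which the row maximum over
# columns (x - y):x is unique, together with the winning row.
first_unique <- function(x, mat, max_y = 30) {
  for (y in seq_len(min(max_y, x - 1))) {
    k <- rowSums(mat[, (x - y):x, drop = FALSE])
    i <- which(k == max(k))
    if (length(i) == 1) return(list(y = y, row = i))
  }
  NULL  # no unique maximum found within the bound
}

# Small made-up matrix (3 rows, 4 columns), filled column by column
small <- matrix(c(1, 9, 1,  1, 1, 1,  5, 5, 1,  5, 5, 1), nrow = 3)
res <- first_unique(4, small)
res  # y = 3, row = 2: rows 1 and 2 tie for y = 1 and y = 2
```

Returning both y and the row covers the two readings of the question (the value of y, or the position of the maximum).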

Related

Saving quantities in a for loop

I am having problems when saving the results in a for loop.
I am computing a variance (this is not relevant I think) and my code is:
library(dirmult)
n <- 50
p <- 20
size <- 5*p
prob_true <- rep(1/p, p)
multinom <- as.matrix(rmultinom(n, size, prob = prob_true))
zeros <- round(0.5*p*n)
a <- c(as.matrix(multinom))
a[sample(1:(p*n), zeros)] <- 0
data_zeros <- matrix(a, p, n)
dirmult <- dirmult(t(data_zeros))
alpha <- dirmult$gamma
sum_alpha <- (1-dirmult$theta)/dirmult$theta
for (j in ncol(data_zeros)){
A <- alpha/sum_alpha
B <- 1 - A
N <- colSums(data_zeros)
C <- 1 + sum_alpha
var_s_dirm <- list()
var_s_dirm[[j]] <- N[j]*A*B*((N[j]+sum_alpha)/C)
}
In particular, alpha is a vector with 20 values, sum_alpha is a scalar, data_zeros is my dataset with 20 rows and 50 columns, and N is the sum of each column of the dataset, so it is a vector with 50 values.
It seems very simple to do what I want:
I want to get a list of 50 vectors, where each one differs from the others because it is multiplied by a different value of N.
I really hope that somebody can help me find the error.
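The problem is (probably) that you recompute the constants on every iteration and, more importantly, clear the list in each step with the line var_s_dirm <- list(). Note also that for (j in ncol(data_zeros)) runs only once, with j = 50; to visit every column you need 1:ncol(data_zeros).
See if this works for you: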
library(dirmult)
n <- 50
p <- 20
size <- 5*p
prob_true <- rep(1/p, p)
multinom <- as.matrix(rmultinom(n, size, prob = prob_true))
zeros <- round(0.5*p*n)
a <- c(as.matrix(multinom))
a[sample(1:(p*n), zeros)] <- 0
data_zeros <- matrix(a, p, n)
dirmult <- dirmult(t(data_zeros))
alpha <- dirmult$gamma
sum_alpha <- (1-dirmult$theta)/dirmult$theta
A <- alpha/sum_alpha
B <- 1 - A
N <- colSums(data_zeros)
C <- 1 + sum_alpha
var_s_dirm <- list()
for (j in 1:ncol(data_zeros)){
var_s_dirm[[j]] <- N[j]*A*B*((N[j]+sum_alpha)/C)
}
output
var_s_dirm
[[1]]
[1] 2.614833 2.327105 2.500483 3.047700 2.233528 2.130223 2.700103 2.869699 2.930213 2.575903 2.198459 2.846096
[13] 2.425448 3.517559 3.136266 2.565345 2.578267 2.763113 2.709707 3.420792
[[2]]
[1] 2.568959 2.286279 2.456615 2.994231 2.194343 2.092850 2.652732 2.819353 2.878806 2.530712 2.159889 2.796165
[13] 2.382897 3.455848 3.081244 2.520339 2.533034 2.714637 2.662168 3.360778
[[3]]
[1] 3.211199 2.857849 3.070769 3.742790 2.742930 2.616064 3.315916 3.524193 3.598509 3.163391 2.699862 3.495207
[13] 2.978622 4.319811 3.851556 3.150424 3.166294 3.393297 3.327711 4.200974
....
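As a sketch of the same pattern without the dirmult dependency, the list can be preallocated once outside the loop, or the loop can be replaced by lapply. The numbers below are made-up stand-ins for alpha/sum_alpha and colSums(data_zeros), chosen only for illustration.

```r
# Made-up stand-ins for the quantities in the question (assumptions)
A <- c(0.2, 0.3, 0.5)   # plays the role of alpha / sum_alpha
B <- 1 - A
sum_alpha <- 4
C <- 1 + sum_alpha
N <- c(10, 20)          # plays the role of colSums(data_zeros)

# Create the list once, before the loop, and fill one slot per iteration
var_list <- vector("list", length(N))
for (j in seq_along(N)) {
  var_list[[j]] <- N[j] * A * B * ((N[j] + sum_alpha) / C)
}

# Equivalent without an explicit loop
var_list2 <- lapply(N, function(n_j) n_j * A * B * ((n_j + sum_alpha) / C))
```

seq_along(N) is used instead of 1:length(N) so that an empty N gives zero iterations.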

Correlation with sliding window

I want to compute a Pearson correlation between two vectors (four elements each) with a sliding window (step = 1) and keep the best result:
list1 <- read.table(text= "20
34
89
35")
list2 <- read.table(text= "22
99
313
13
71
200")
The comparison will be a loop on:
cor(x=c(20,34,89,35),y=c(22,99,313,13), method = "pearson")
cor(x=c(20,34,89,35),y=c(99,313,13,71), method = "pearson")
cor(x=c(20,34,89,35),y=c(313,13,71,200), method = "pearson")
The result will contain the score and the vectors that give the highest correlation score. In this case it will be: x=c(20,34,89,35) and y=c(22,99,313,13) and 0.9588095.
Use rollapply to compute the correlations, find the index of the largest one, and derive y and its correlation with x from that.
library(zoo)
x <- list1$V1
w <- length(x)
ix <- which.max(rollapply(list2$V1, w, cor, x))
y <- list2$V1[seq(ix, length = w)]
y
## [1] 22 99 313 13
cor(x, y)
## [1] 0.9588095
A variation of the above is to return the correlation and the y vector from rollapply:
r <- rollapply(list2$V1, length(x), function(y) c(cor(x, y), y))
ix <- which.max(r[, 1])
r[ix, 1]
## [1] 0.9588095
r[ix, -1]
## [1] 22 99 313 13
A base R solution
out <- list(NULL)
j <- 1
ind <- 0
# slide a window of four elements over list2$V1 until it reaches the last element
while(ind[length(ind)]<length(list2$V1)){
ind <- j:(j+3)
out[[j]] <- list(Vector1=list1$V1,
Vector2=list2$V1[ind],
Cor=cor(list1$V1, list2$V1[ind]))
j <- j+1
}
# pick the window with the highest correlation
out[[which.max(unlist(sapply(out, "[", "Cor")))]]
which produces:
$Vector1
[1] 20 34 89 35
$Vector2
[1] 22 99 313 13
$Cor
[1] 0.9588095
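A shorter base R variant, sketched here with plain vectors instead of the read.table data frames (the names v, starts and cors are assumptions for illustration), computes one correlation per window start with sapply:

```r
x <- c(20, 34, 89, 35)
v <- c(22, 99, 313, 13, 71, 200)
w <- length(x)

# One window per start position; the window length is taken from x
starts <- seq_len(length(v) - w + 1)
cors <- sapply(starts, function(i) cor(x, v[i:(i + w - 1)]))

best <- which.max(cors)
v[best:(best + w - 1)]
## [1]  22  99 313  13
cors[best]
## [1] 0.9588095
```

Unlike the while loop, the window length is not hard-coded to four here.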

Functions with loops & multiples in R

I'm still getting to grips with R and have been set the task of specifically writing a function where if x and y are vectors:
x <- c(3,7,9)
y <- 20
...then all of x and multiples of x which are less than y need to be output in the form of a vector, e.g.:
v1 <- c(3,6,7,9,12,14,15,18)
But then within the function it needs to sum up all the numbers in the vector v1 - (3+6+...+15+18).
I've had a go at it but I can never really get my head around if else statements, so could anyone help me out and explain so I know for future reference?
No loops needed. Figure out how many times each x value goes into y, then generate a list of the unique numbers:
x <- c(3,7,9)
y <- 20
possible <- (y - 1) %/% x # largest multiplier that keeps each multiple strictly below y
#[1] 6 2 2
out <- unique(sequence(possible) * rep(x,possible))
# or alternatively
# out <- unique(unlist(Map(function(a,b) sequence(a) * b, possible, x)))
out
#[1] 3 6 9 12 15 18 7 14
sum(out)
#[1] 84
Here's an example using basic loops and if else branching in R.
x <- c(3,7,9)
y1 <- 20
v1 <- numeric()
for(i in x){
nex <- i
counter <- 1
repeat{
if(!(nex %in% v1)){
v1 <- c(v1, nex)
}
counter <- counter + 1
nex <- i*counter
if(nex >= y1){
break
}
}
}
v1 <- sort(v1)
v1.sum <- sum(v1)
v1
## 3 6 7 9 12 14 15 18
v1.sum
## 84
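Since the task asks for a function, the steps above could be wrapped up roughly as follows. This is a sketch: the name sum_multiples and the returned list are assumptions, and x is assumed to contain positive numbers.

```r
# Hypothetical wrapper: all multiples of the elements of x that are
# strictly less than y, plus their sum.
sum_multiples <- function(x, y) {
  mult <- lapply(x, function(xi) {
    if (xi < y) seq(xi, y - 1, by = xi) else numeric(0)
  })
  v <- sort(unique(unlist(mult)))
  list(values = v, total = sum(v))
}

res <- sum_multiples(c(3, 7, 9), 20)
res$values
## [1]  3  6  7  9 12 14 15 18
res$total
## [1] 84
```

The single if/else only guards against elements of x that are already at or above y.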

Calculate derivative and use the result as a function

I need to calculate a derivative, say f = x^2. I am using the code
D(expression(x^2), 'x')
How do I use the output of D(x^2) = 2x as a function that will take values?
k <- deriv(~ x^2, "x")
x <- -1:10
eval(k) # returns x^2, with the derivative 2*x stored in the "gradient" attribute
Here is a way to convert the results to a function:
> myfun <- function(x) {}
> body(myfun) <- D( expression(x^2), 'x' )
>
> myfun( 1:10 )
[1] 2 4 6 8 10 12 14 16 18 20
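Two other ways to get a callable derivative, sketched with base R only (the names dx, f_prime and g2 are assumptions for illustration): evaluate the result of D() inside a function, or ask deriv() to return a function via function.arg = TRUE.

```r
# 1. Evaluate the symbolic derivative inside a function;
#    eval() looks up x in the calling function's frame.
dx <- D(expression(x^2), "x")
f_prime <- function(x) eval(dx)
f_prime(1:10)
## [1]  2  4  6  8 10 12 14 16 18 20

# 2. Let deriv() build a function; the derivative is stored
#    in the "gradient" attribute of the returned value.
g2 <- deriv(~ x^2, "x", function.arg = TRUE)
grad <- attr(g2(3), "gradient")
as.numeric(grad)
## [1] 6
```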

vectorize head(which(t > x), n=1) for many values of x

I have a situation similar to the following in R:
t <- (1:100) * 15
x <- c(134, 552, 864, 5000)
And I want to find, for each value in x, the first index in t where t > x. The following works using a loop:
y <- numeric(length(x))
for (i in 1:length(x))
y[i] <- which(t > x[i])[1]
# y
# [1] 9 37 58 NA
I was taught that loops in R are 'bad and slow', and while the time this takes to run for a reasonably large x is not a deal-breaker, I would like to know whether there is a better way?
If the objects are not too big (so that RAM is not limiting), you don't need *apply functions, which are just hidden loops.
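temp <- outer(x,t,'<') # TRUE where x[i] < t[j]; assumes t is sorted in increasing order
y <- length(t) - (rowSums(temp)-1) # index of the first t greater than x[i]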
y[y>length(t)] <- NA
#[1] 9 37 58 NA
fun <- function(x){
which(t > x)[1]
}
R > sapply(x, fun)
[1] 9 37 58 NA
Almost the same:
require(functional)
apply(matrix(t > rep(x, each=length(t)), length(t)), 2, Compose(which, Curry(append, Inf), min))
## [1] 9 37 58 Inf
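If t is sorted in increasing order, as it is in the question, base R's findInterval offers another vectorized route. This is a sketch under that assumption; the vector is named tt here only to avoid masking the base function t().

```r
tt <- (1:100) * 15
x <- c(134, 552, 864, 5000)

# findInterval counts how many elements of tt are <= x[i];
# the next position is the first index with tt > x[i].
y <- findInterval(x, tt) + 1L
y[y > length(tt)] <- NA
y
## [1]  9 37 58 NA
```

This avoids building the length(x) by length(t) matrix that outer creates.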
