Variable number of inputs for a function in R

I have the following function, which takes the vectors A and B as inputs. How can I make it accept a different number of vectors? In my function, n is fixed and gives the number of samples. For example, I need the function to work for this case as well: sample_sum(A, B, C, D, E, n, ...).
Also, for the part where I draw samples from vectors A and B (i.e. the line sample_A <- qss(A, n=n, ...)), how can I modify it to handle a different number of vectors without a for loop, since a for loop is not fast enough?
Thanks
sample_sum <- function(A, B, n, ...)
{
qss <- function(X, n, ...)
{
r_X <- range(X)
dens_X <- density(X, ...)
sample(dens_X$x, size=n, prob=dens_X$y, replace=TRUE)
}
sample_A <- qss(A, n=n, ...)
sample_B <- qss(B, n=n, ...)
sample_A + sample_B
}

I think what you need is the following:
new_sample_sum <- function(my_vector_list, n, ...)
{
qss <- function(X, n, ...)
{
r_X <- range(X)
dens_X <- density(X, ...)
sample(dens_X$x, size=n, prob=dens_X$y, replace=TRUE)
}
rowSums(sapply(my_vector_list, qss, n = n, ...)) # forward ... so density() options still reach qss
}
You can put any number of vectors in the list and run the calculation over it without worrying about how many there are. A brief note on the calculation: sapply applies qss to each vector in the list and returns a matrix with one column per vector. Since your function adds the samples in its final step, I take the rowSums of that matrix.
And to check that the results are consistent (the seed is reset before each call so that both draw the same samples):
x <- c(1,2,3,4)
y <- c(6,7,9,0)
z <- c(2,2,2,2)
set.seed(1)
> sample_sum(x, y, n=2)
[1] 6.545129 13.474390
set.seed(1)
> new_sample_sum(list(x, y), n = 2)
[1] 6.545129 13.474390
Or with more vectors:
sample_sum <- function(A, B, C, n, ...)
{
qss <- function(X, n, ...)
{
r_X <- range(X)
dens_X <- density(X, ...)
sample(dens_X$x, size=n, prob=dens_X$y, replace=TRUE)
}
sample_A <- qss(A, n=n, ...)
sample_B <- qss(B, n=n, ...)
sample_C <- qss(C, n=n, ...)
sample_A + sample_B + sample_C
}
set.seed(1)
> sample_sum(x, y, z, n = 2)
[1] 6.102482 15.450364
set.seed(1)
> new_sample_sum(list(x, y, z), n = 2)
[1] 6.102482 15.450364
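If the vectors should be passed as separate arguments, as in sample_sum(A, B, C, D, E, n), one option is to collect them with list(...) and pass the density options separately. The sketch below is only an illustration under assumptions: the names sample_sum_dots and density_args are made up here, and Reduce is used instead of rowSums so that n = 1 also works.

```r
# Sketch: accept any number of vectors via ..., with n passed by name.
# sample_sum_dots and density_args are hypothetical names, not from the thread.
sample_sum_dots <- function(..., n, density_args = list()) {
  vectors <- list(...)
  qss <- function(X) {
    # Forward optional density() settings such as bw or kernel
    dens_X <- do.call(density, c(list(X), density_args))
    sample(dens_X$x, size = n, prob = dens_X$y, replace = TRUE)
  }
  # Element-wise sum of the per-vector samples
  Reduce(`+`, lapply(vectors, qss))
}

x <- c(1, 2, 3, 4)
y <- c(6, 7, 9, 0)
z <- c(2, 2, 2, 2)
set.seed(1)
res <- sample_sum_dots(x, y, z, n = 5)
```

Because n comes after ..., it has to be given by name (n = 5); an unnamed value would be treated as one more vector.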

Just use n=length(A) to dynamically determine the size of the vector arguments.

Related

How to write this function to have as output a vector when the input is a vector?

Hello,
I want the output to be a vector when the input is a vector, without using a function from the apply family. How should I write my function?
Thanks.
I used this code where I was forced to use two functions.
f1=function(l){
y= B1 # vector of length N
li= position # a vector of length N
h=10
a=(li-l)/h
Knorm=dnorm(a)
b=Knorm*y
num=sum(b)
den=sum(Knorm)
num/den
}
########## Vectorized form
f2 = function(l){
sapply(l,f1)
}
L=seq(10000000,11000000,by=1)
f2(L)
If I compute f1(L) I will get one value. That's why I was forced to write a second function to apply my first function to each element of vector L.
The purpose is to write it in one function.
Use outer and colSums to allow the function to take l as a vector:
f <- function(l){
y <- B1 # vector of length N
li <- position # a vector of length N
h <- 10
a <- outer(li, l, "-")/h
Knorm <- dnorm(a)
b <- Knorm*y
num <- colSums(b)
den <- colSums(Knorm)
num/den
}
And here is a simpler equivalent function:
f <- function(l){
Knorm <- dnorm(outer(position, l, "-")/10)
colSums(Knorm*B1)/colSums(Knorm)
}
Compare to OP's function:
f1=function(l){
y= B1 # vector of length N
li= position # a vector of length N
h=10
a=(li-l)/h
Knorm=dnorm(a)
b=Knorm*y
num=sum(b)
den=sum(Knorm)
num/den
}
position <- 10:1
B1 <- 1:10
sapply(8:12, f1)
#> [1] 5.300480 5.220937 5.141656 5.062713 4.984177
f(8:12)
#> [1] 5.300480 5.220937 5.141656 5.062713 4.984177
UPDATE
Based on the comments, something like this may work best for the large vectors involved:
library(parallel)
f1 <- function(l) {
dkAll <- abs(outer(position, l, "-"))
Knorm <- dnorm(outer(position, l, "-")/pmax(dkAll[order(col(dkAll), dkAll)[seq(70, by = length(position), length.out = length(l))]], 1000))
colSums(Knorm*y)/colSums(Knorm)
}
y <- seq(1, 100, length.out = 23710)
position <- seq(10351673, 12422082, length.out=23710)
l <- seq(11190000, 11460000, by=10)
# ysmoothed <- f1(l) # memory allocation error
cl <- makeCluster(detectCores())
clusterExport(cl, list("y", "position", "l", 'f1'))
system.time(ysmoothed <- parLapply(cl, l, f1))
#> user system elapsed
#> 0.02 0.00 20.13
stopCluster(cl) # release the worker processes when done
Created on 2022-02-02 by the reprex package (v2.0.1)
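The memory allocation error mentioned above most likely comes from outer building a full length(position) by length(l) matrix in one step. A possible single-process alternative is to process l in chunks so that only a slice of that matrix exists at any time. The sketch below applies this to the simpler fixed-bandwidth function from the answer; f_chunked and chunk_size are hypothetical names, and the bandwidth h = 10 is taken from the question.

```r
# Sketch: kernel smoother evaluated chunk by chunk to bound memory use.
# f_chunked and chunk_size are illustrative names, not from the thread.
f_chunked <- function(l, position, B1, h = 10, chunk_size = 1000) {
  # Split the evaluation points into groups of at most chunk_size
  chunks <- split(l, ceiling(seq_along(l) / chunk_size))
  res <- lapply(chunks, function(lc) {
    # One column per evaluation point in this chunk
    Knorm <- dnorm(outer(position, lc, "-") / h)
    colSums(Knorm * B1) / colSums(Knorm)
  })
  unlist(res, use.names = FALSE)
}

position <- 10:1
B1 <- 1:10
smoothed <- f_chunked(8:12, position, B1, chunk_size = 2)
```

A smaller chunk_size lowers peak memory at the cost of more function calls; the results should match the unchunked version.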

Making custom functions in R involving summation

I am a novice in R and have been asked to compute a descriptive statistic called dominance (D; expressed as a percentage). D is defined as the mean abundance (MA) of x divided by the sum of the MA values of all the vectors. MA, in turn, is defined as the sum of all values in a vector divided by the length of that vector. Here is how I normally approach things:
#Example data
x <- c(1, 2, 3)
y <- c(4, 5, 6)
z <- c(7, 8, 9)
#Mean abundance function
mean.abundance <- function(x){
N_sum <- sum(x)
N_count <- length(x)
N_sum/N_count
}
#Percent dominance function (workaround)
percent.dominance <- function(x, ...){
MA_a <- (x)
sum_MA_i <- sum(x, ...)
(MA_a/sum_MA_i)*100
}
MA_x <- mean.abundance(x)
MA_y <- mean.abundance(y)
MA_z <- mean.abundance(z)
MA <- c(MA_x, MA_y, MA_z)
MA
D_x <- percent.dominance(MA_x, MA_y, MA_z)
D_y <- percent.dominance(MA_y, MA_x, MA_z)
D_z <- percent.dominance(MA_z, MA_x, MA_y)
D <- c(D_x, D_y, D_z)
D
That approach alone already gives me the %D values I am looking for. My problem is that my (perfectionist) PI is asking me to compute the %D values directly from the vectors x, y, and z (and not stepwise, by first calculating the MA values and then using MA_x, MA_y, and MA_z to calculate %D). I am stuck writing a custom function for %D that works on vectors of raw data; here is a failed attempt to revise the function, just to give a general idea.
#Percent dominance function (incorrect)
percent.dominance <- function(x, ...){
MA_a <- sum(x)/length(x)
sum_MA_i <- sum(x, ...)/length(x, ...)
(MA_a/sum_MA_i)*100
}
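You can capture the optional data passed with list(...) and make the following changes to the function. The mean abundance of each extra vector is computed separately, so the result stays correct when the vectors differ in length:
percent.dominance <- function(x, ...){
data <- list(...)
MA_a <- sum(x)/length(x)
# sum of the mean abundances of x and of every extra vector
sum_MA_i <- MA_a + sum(vapply(data, function(v) sum(v)/length(v), numeric(1)))
(MA_a/sum_MA_i)*100
}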
percent.dominance(x, y, z)
#[1] 13.33333
percent.dominance(y, x, z)
#[1] 33.33333
percent.dominance(z, x, y)
#[1] 53.33333
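If all the %D values are wanted in one call, the same idea can be written so that every vector is treated alike. This is a sketch under that assumption; dominance_all is a hypothetical name, not part of the answer above.

```r
# Sketch: percent dominance of every vector passed in, returned in argument order.
# dominance_all is an illustrative name.
dominance_all <- function(...) {
  vectors <- list(...)
  # Mean abundance of each vector, computed separately
  ma <- vapply(vectors, function(v) sum(v) / length(v), numeric(1))
  ma / sum(ma) * 100
}

x <- c(1, 2, 3)
y <- c(4, 5, 6)
z <- c(7, 8, 9)
D <- dominance_all(x, y, z)
```

Here D holds the values for x, y, and z in that order, and they sum to 100.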

How can I get the list of all partitions of dimension k in R

I'm using this code:
library("partitions")
x <- c(2,4,6)
parts <- listParts(length(x))
out <- rapply(parts, function(ii) x[ii], how="replace")
to compute the list of all partitions of a vector, but I would like only the partitions of dimension k, for example:
k=2
{(2),(4,6)}{(4),(2,6)}{(6),(2,4)}
Maybe there are better ways of doing this but the following does what you want.
library(partitions)
funParts <- function(x, k){
parts <- listParts(length(x))
res <- lapply(parts, function(inx) sapply(inx, function(i) x[i]))
res <- unlist(res, recursive = FALSE)
res <- res[sapply(res, length) <= k]
unique(res)
}
x <- c(2,4,6)
k <- 2
funParts(x, 2)
funParts(x, 1)
funParts(4:10, 3)
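If "dimension k" is read as "exactly k blocks", as the k=2 example in the question suggests, the partitions can also be enumerated in base R by labelling each element with a block number and keeping only the canonical labellings. This is a sketch under that reading; partitions_k is a hypothetical name, and the approach enumerates k^n labellings, so it is only practical for short vectors.

```r
# Sketch: all partitions of x into exactly k non-empty blocks (base R only).
# partitions_k is an illustrative name; cost grows as k^length(x).
partitions_k <- function(x, k) {
  n <- length(x)
  # Every way of assigning a block label 1..k to each element
  grid <- as.matrix(expand.grid(rep(list(seq_len(k)), n)))
  keep <- apply(grid, 1, function(lab) {
    # Canonical labelling: the first label is 1, and each later label is at most
    # one more than the largest label seen so far; all k labels must be used
    all(lab <= c(1, cummax(lab)[-n] + 1)) && max(lab) == k
  })
  lapply(which(keep), function(i) unname(split(x, grid[i, ])))
}

x <- c(2, 4, 6)
parts2 <- partitions_k(x, 2)
```

For x <- c(2, 4, 6) and k = 2 this yields the three partitions listed in the question, each as a list of two vectors.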

R: populating an array from lists

I have a function f(x, y) that returns a list of 8 logical vectors, where x and y are integers. I want to populate a three-dimensional array M so that M[x, y, z] is the number of TRUEs in the zth element of f(x, y). I can do this with nested for loops, but I know those are frowned upon in R. I think there's a more elegant way, using either outer or rbind and sapply but I can't figure it out. Here's my code with the nested for loops:
M <- array(dim=c(150, 200, 8))
for(j in 1:150) {
for(k in 1:200) {
rsu <- f(j, k)
for(z in 1:8) {
M[j, k, z] <- sum(rsu[[z]])
}}}
What is a more efficient/elegant way of populating this array that gives the same result?
Edited to add: For purposes of this question, treat f as a black box. In reality it involves various calculations and lookups about eight different satellites, but here's a dummy function that will generate some data for this example:
is.prime <- function(n) n == 2L || all(n %% 2L:ceiling(sqrt(n)) != 0)
#source for is.prime function:
# https://stackoverflow.com/questions/19767408/prime-number-function-in-r
f <- function(x,y) {
retlist <- list()
retlist[[1]] <- c(FALSE, FALSE, rep(TRUE, x))
retlist[[2]] <- c(TRUE, TRUE, rep(FALSE, y), rep(TRUE, y))
retlist[[3]] <- c(is.prime(x), is.prime(y), is.prime(x+y), is.prime(x+y+3), sapply(x:(2*(x+y)), is.prime))
retlist[[4]] <- c(x+y %% 5 == 0, x*y %% 6 ==0)
retlist[[5]] <- retlist[[(x+y) %% 4 + 1]]
retlist[[6]] <- retlist[[y %% 4 + 1]]
retlist[[7]] <- retlist[[x %% 6 + 1]]
retlist[[8]] <- sapply(abs(x-y):(7L*x+y+1), is.prime)
return(retlist)
}
Here's how to populate the array, giving the same results, using nested functions and sapply instead of for:
f2 <- function(x,y) {
rsu <- f(x,y)
values <- vapply(1:8, FUN=function(z) sum(rsu[[z]]), FUN.VALUE=1L)
}
f3 <- function(x) array(data=t(sapply(1:200, FUN=function(w) f2(x,w))), dim=c(1,200,8))
M2 <- array(data=t(sapply(1:150, FUN=f3)), dim=c(150,200,8))
Here's how to do it with outer. But it's unintuitive; the matrix data are assigned within the function. I don't understand why I need to invoke Vectorize(f2) here instead of simply f2.
M2 <- array(dim=c(150, 200, 8))
f2 <- function(x, y) {
rsu <- f(x, y)
M2[x, y, ] <<- vapply(1:8, FUN=function(z) sum(rsu[[z]]), FUN.VALUE=1L)
return(0L)
}
ABC <- outer(1:150, 1:200, Vectorize(f2))
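On the Vectorize question: outer does not call FUN once per pair. It expands the two inputs into long vectors holding every combination and calls FUN a single time, expecting a result of the same length, so a function written for scalar x and y has to be wrapped with Vectorize (or mapply). The sketch below avoids the <<- assignment by collecting the counts with mapply and reshaping them. f_demo is a hypothetical stand-in for the black-box f, and the small dimensions are only for illustration.

```r
# Hypothetical stand-in for f: returns a list of 3 logical vectors
f_demo <- function(x, y) {
  list(rep(TRUE, x), rep(c(TRUE, FALSE), y), c(x > y, x == y))
}

nx <- 4; ny <- 5; nz <- 3
# All (x, y) pairs, with x varying fastest
grid <- expand.grid(x = seq_len(nx), y = seq_len(ny))
# One column of nz counts per (x, y) pair
counts <- mapply(function(x, y) vapply(f_demo(x, y), sum, integer(1)),
                 grid$x, grid$y)
# counts fills an nz x nx x ny array; reorder the dimensions to [x, y, z]
M <- aperm(array(counts, dim = c(nz, nx, ny)), c(2, 3, 1))
```

This still calls the function once per pair, so it is tidier rather than faster than the nested loops; any real speed-up would have to come from vectorizing f itself.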

Error in seq.default(a, b, by = h) : 'to' cannot be NA, NaN or infinite

I have two functions: one computes an integral, and the other uses the fixed-point method to find a root. This is the function that computes the integral:
trapezoid <- function(fun, a, b, n=100) {
h <- (b-a)/n
x <- seq(a, b, by=h)
y <- fun(x)
s <- h * (y[1]/2 + sum(y[2:n]) + y[n+1]/2)
return(s)
}
And this is the root finding function:
fixedpoint <- function(fun, x0, tol=1e-03, niter=5000){
## fixed-point algorithm to find x such that fun(x) == x
## assume that fun is a function of a single variable
## x0 is the initial guess at the fixed point
xold <- x0
xnew <- fun(xold)
for (i in 1:niter) {
xold <- xnew
xnew <- fun(xold)
if ( abs((xnew-xold)) < tol )
return(xnew)
}
stop("exceeded allowed number of iterations")
}
Now I define a function f:
f <- function(x) {x^2}
and its integral as a function of the upper limit:
h <- function(x) trapezoid(f, 2, x)
Finally, I want to find the roots of h by calling:
fixedpoint(h, 2)
But I get this error message:
Error in seq.default(a, b, by = h) : 'to' cannot be NA, NaN or infinite
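One likely cause, as far as can be told from the code shown: with x0 = 2 the first call is trapezoid(f, 2, 2), so the step is 0 and seq(2, 2, by = 0) returns a single point; y[2:n] then contains NA, h(2) is NA, and the next call passes NA to seq as b. Separately, fixedpoint looks for x with fun(x) == x rather than for a root of fun, and for this h the iteration moves away from the starting point very quickly. The sketch below uses a hypothetical variant, trapezoid_safe, that builds the grid with length.out so that a == b is handled, and records the iterates to show the divergence.

```r
# Sketch: trapezoid rule that tolerates a == b (trapezoid_safe is an illustrative name)
trapezoid_safe <- function(fun, a, b, n = 100) {
  stopifnot(is.finite(a), is.finite(b))
  x <- seq(a, b, length.out = n + 1) # always n + 1 points, even when a == b
  y <- fun(x)
  step <- (b - a) / n
  step * (y[1] / 2 + sum(y[2:n]) + y[n + 1] / 2)
}

f <- function(x) x^2
h <- function(x) trapezoid_safe(f, 2, x)

# Follow the fixed-point iteration x <- h(x) from x0 = 2
x <- 2
iterates <- numeric(0)
for (i in 1:15) {
  x <- h(x)
  iterates <- c(iterates, x)
  if (abs(x) > 1e6) break # stop once the iterates have clearly diverged
}
```

The first iterate is 0 (the integral from 2 to 2), and the later ones grow rapidly in magnitude, so the iteration never meets the tolerance. If a root of h is really what is wanted, a root finder such as uniroot would be the more natural tool than a fixed-point iteration.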
