Having a function output two quantities in looping structure in R? - r

In my R function below, how can I change the code so that fun also returns pe? Right now, fun only outputs L and U.
P.S. I want the function to keep working as it does right now, so the replicate call may also need to change once fun outputs pe in addition to L and U.
CI.bi = function(n, p, n.sim){
fun <- function(n1 = n, p1 = p){
x <- rbinom(1, size = n1, prob = p1)
pe <- x/n1
res <- binom.test(x, n1, p1)[[4]]
c(L = res[1], U = res[2])
}
sim <- t(replicate(n.sim, fun()))
y = unlist(lapply(1:n.sim, function(x) c(x, x)))
plot(sim, y, ty = "n", ylab = NA, yaxt = "n")
segments(sim[ ,1], 1:n.sim, sim[ ,2], 1:n.sim, lend = 1)
}
# Example of use:
CI.bi(n = 15, p = .5, n.sim = 3)

You can have fun() return pe as an additional element of the return vector.
When referencing sim later on, just specify which columns you want to use. I believe the code sample below replicates your current functionality but has pe as an additional output of fun():
CI.bi = function(n, p, n.sim){
fun <- function(n1 = n, p1 = p){
x <- rbinom(1, size = n1, prob = p1)
pe <- x/n1
res <- binom.test(x, n1, p1)[[4]]
c(L = res[1], U = res[2], pe=pe)
}
sim <- t(replicate(n.sim, fun()))
y = unlist(lapply(1:n.sim, function(x) c(x, x)))
plot(sim[,1:2], y, ty = "n", ylab = NA, yaxt = "n")
segments(sim[ ,1], 1:n.sim, sim[ ,2], 1:n.sim, lend = 1)
}
CI.bi(n = 15, p = .5, n.sim = 3)
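As a small standalone sketch of the same idea (the name one_draw and the seed are illustrative assumptions, not part of the original function), the named elements returned by the inner function become column names after t(replicate(...)), so pe can be picked out by name:

```r
set.seed(1)  # assumption: fixed seed only so the sketch is reproducible

# Hypothetical stand-in for fun(): one binomial draw, its exact CI, and pe.
one_draw <- function(n = 15, p = 0.5) {
  x  <- rbinom(1, size = n, prob = p)
  ci <- binom.test(x, n, p)$conf.int
  c(L = ci[1], U = ci[2], pe = x / n)
}

# replicate() returns one column per draw; t() turns draws into rows.
sim <- t(replicate(3, one_draw()))

colnames(sim)   # the names given in c(...) inside one_draw()
sim[, "pe"]     # point estimates, one per simulated sample
sim[, c("L", "U")]  # the interval limits used for plotting
```

Selecting columns by name rather than by position keeps the plotting code correct even if more outputs are added later.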

Related

Saving x, and y from `curve()` in a for loop R?

I have a function called all.priors (see R code below). My goal is to get the x and y from the curve() call inside the for loop, and save these xs and ys as object h.
(I want to have 101 rows, and 2*length(d) columns in h. This way, each 2 columns, contain x and y from a curve() run in the for loop.)
Question:
how can I correctly save the xs and ys from the curve() call? [I get the error: incorrect number of subscripts on matrix]
all.priors = function(a, b, lo, hi, d, Bi = 55, n = 1e2){
h = matrix(NA, 101, 2*length(d))
for(i in 1:length(d)){
p = function(x) get(d[i])(x, a, b)
prior = function(x) p(x)/integrate(p, lo, hi)[[1]]
likelihood = function(x) dbinom(Bi, n, x)
posterior = function(x) prior(x)*likelihood(x)
h[i,] = curve(posterior, ty = "n", ann = FALSE, yaxt = "n", xaxt = "n", add = i!= 1, bty = "n")
}
}
#Example of use:
all.priors(lo = 0, hi = 1, a = 2, b = 3, d = c("dgamma", "dnorm", "dcauchy", "dlogis"))
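You just need to carefully place the values in the matrix and then return the matrix from your function. curve() returns a list with components x and y, so it cannot be assigned to a single matrix row; put x and y in their own columns instead. Try this: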
all.priors = function(a, b, lo, hi, d, Bi = 55, n = 1e2){
h = matrix(NA, 101, 2*length(d))
for(i in 1:length(d)){
p = function(x) get(d[i])(x, a, b)
prior = function(x) p(x)/integrate(p, lo, hi)[[1]]
likelihood = function(x) dbinom(Bi, n, x)
posterior = function(x) prior(x)*likelihood(x)
cv <- curve(posterior, ty = "n", ann = FALSE, yaxt = "n", xaxt = "n", add = i!= 1, bty = "n")
h[,i*2-1] <- cv$x
h[,i*2] <- cv$y
}
h
}
all.priors(lo = 0, hi = 1, a = 2, b = 3, d = c("dgamma", "dnorm", "dcauchy", "dlogis"))
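To make the column layout concrete, here is a minimal standalone sketch. The toy functions in fns and the plotting range are illustrative assumptions, not part of all.priors, and pdf(NULL) is only there to discard the plot output.

```r
# Assumption: three toy functions stand in for the posterior curves.
fns <- c("sin", "cos", "exp")

pdf(NULL)  # discard plot output; remove this line to see the curves

# curve() evaluates at n = 101 points by default, hence 101 rows.
h <- matrix(NA_real_, nrow = 101, ncol = 2 * length(fns))
for (i in seq_along(fns)) {
  f  <- get(fns[i])
  cv <- curve(f(x), from = 0, to = 1)  # invisibly returns list(x = , y = )
  h[, 2 * i - 1] <- cv$x               # odd columns hold x
  h[, 2 * i]     <- cv$y               # even columns hold y
}
invisible(dev.off())

# Label the columns so each x/y pair is easy to find later.
colnames(h) <- paste0(rep(fns, each = 2), c(".x", ".y"))
head(h, 3)
```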
A different way to solve this might be to save the answers in a list rather than a matrix. I think that your function complicates the picture of what is going on, so I will use a simpler example.
h = list()
for(i in 1:5) {
h[i] = list(curve(sin(i*x), xlim=c(0,6.3)))
}
The resulting data structure should be easy to use.
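For instance, a sketch of how the list could be read back afterwards (using from/to instead of xlim, and pdf(NULL) to suppress plotting, both of which are my own choices here):

```r
pdf(NULL)  # discard plot output; remove this line to see the curves

h <- vector("list", 5)
for (i in seq_along(h)) {
  # Each element is the list(x = , y = ) that curve() returns invisibly.
  h[[i]] <- curve(sin(i * x), from = 0, to = 6.3)
}
invisible(dev.off())

# x and y for the second curve:
x2 <- h[[2]]$x
y2 <- h[[2]]$y

# If a matrix is still wanted, the y values can be bound column-wise.
y_mat <- sapply(h, function(cv) cv$y)
dim(y_mat)
```

A list avoids having to work out column indices by hand, and it also copes with curves evaluated at different numbers of points.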

get(paste0(...)) overwrites results in R subarray

I have two subarrays and would like to sample one of them at random. The subarrays consist of character labels from a larger pool of labels as follows:
K <- as.character(1:10)
Suppose I call the arrays K1 and K2 and assign 10 labels between them as follows:
K1 <- 1:8
K2 <- 9:10
I then use
get(paste0("K", i))
to retrieve the labels that were sampled using a 'for' loop.
The issue I am having is this: if K1 is sampled, get(paste0("K", i)) returns labels from K2 (which contain labels 9:10) instead of returning K1 as needed.
In other words, I believe get(paste0("K", i)) may be overwriting results.
Any thoughts on why this may occur and how to go about circumventing the issue?
Below is a more extensive example:
K <- 2
N <- 100
Hstar <- 10
perms <- 10000
probs <- rep(1/Hstar, Hstar)
specs <- 1:N
pop <- array(dim = c(c(perms, N), K))
haps <- as.character(1:Hstar)
K1 <- 1:8 # subarray 1
K2 <- 9:10 # subarray 2
for(j in 1:perms){
for(i in 1:K){
if(i == 1){
pop[j, specs, i] <- sample(haps, size = N, replace = TRUE, prob = probs)
}
else{
pop[j ,, 1] <- sample(haps[K1], size = N, replace = TRUE, prob = probs[K1])
pop[j ,, 2] <- sample(haps[K2], size = N, replace = TRUE, prob = probs[K2]) # probs[K2], so the probabilities match haps[K2] in length
}
}
}
HAC.mat <- array(dim = c(c(perms, N), K))
for(k in specs){
for(j in 1:perms){
for(i in 1:K){
ind.index <- sample(specs, size = k, replace = FALSE)
hap.plot <- pop[sample(1:nrow(pop), size = 1, replace = TRUE), ind.index, sample(i, size = 1, replace = TRUE)]
HAC.mat[j, k, i] <- length(unique(hap.plot))
}
}
}
means <- apply(HAC.mat, MARGIN = 2, mean)
lower <- apply(HAC.mat, MARGIN = 2, function(x) quantile(x, 0.025))
upper <- apply(HAC.mat, MARGIN = 2, function(x) quantile(x, 0.975))
d <- data.frame(specs, means, lower, upper)
par(mfrow = c(1, 2))
if(i == 1){
plot(specs, means, type = "n", xlab = "Specimens sampled", ylab = "Unique haplotypes", ylim = c(1, Hstar))
polygon(x = c(specs, rev(specs)), y = c(lower, rev(upper)), col = "gray")
lines(specs, means, lwd = 2)
HAC.bar <- barplot(N*probs, xlab = "Unique haplotypes", ylab = "Specimens sampled", names.arg = 1:Hstar)
}
else if(i > 1){
plot(specs, means, type = "n", xlab = "Specimens sampled", ylab = "Unique haplotypes", ylim = c(1, Hstar))
polygon(x = c(specs, rev(specs)), y = c(lower, rev(upper)), col = "gray")
lines(specs, means, lwd = 2)
HAC.bar <- barplot(N*probs[get(paste0("K", i))], xlab = "Unique haplotypes", ylab = "Specimens sampled", names.arg = get(paste0("K", i))) ## The issue may lie here
}
Any advice on what may be going on here is greatly appreciated.
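One thing worth checking, offered as a sketch rather than a confirmed diagnosis: in R, a for loop variable keeps its last value after the loop ends. Since the plotting code runs after the loops, i would already equal K (here 2), so get(paste0("K", i)) would always look up K2. The list named groups and the variable picked below are hypothetical names introduced for illustration.

```r
K1 <- 1:8   # subarray 1
K2 <- 9:10  # subarray 2

# After the loop finishes, i still holds the last value it took.
for (i in 1:2) {
  # loop body omitted
}
i                      # 2
get(paste0("K", i))    # therefore always K2 at this point

# A possible alternative: keep the subarrays in a named list and
# sample a name explicitly instead of relying on a leftover loop index.
groups <- list(K1 = K1, K2 = K2)
picked <- sample(names(groups), size = 1)
groups[[picked]]
```

Storing the sampled label in its own variable also makes it clear later which subarray a given plot refers to.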

Vectorized R function to produce sets of histograms

I have a vectorized R function (see below). At each run, the function plots two histograms. My goal is that when argument n is a vector (see example of use below), the function plots length(n) separate sets of these histograms (e.g., if n is a vector of length 2, I expect two sets of histograms, i.e., 4 individual histograms).
I have tried the following with no success. Is there a way to do this?
t.sim = Vectorize(function(n, es, n.sim){
d = numeric(n.sim)
p = numeric(n.sim)
for(i in 1:n.sim){
N = sqrt((n^2)/(2*n))
x = rnorm(n, es, 1)
y = rnorm(n, 0, 1)
a = t.test(x, y, var.equal = TRUE)
d[i] = a[[1]]/N
p[i] = a[[3]]
}
par(mfcol = c(2, length(n)))
hist(p) ; hist(d)
}, "n")
# Example of use:
t.sim(n = c(30, 300), es = .1, n.sim = 1e3) # `n` is a vector of `2` so I expect
# 4 histograms in my graphical device
Vectorize is built on mapply, which calls the function once for each element of your input vector. Inside each call n has length 1, so par(mfcol = c(2, length(n))) resets the layout to 2 x 1 every time. The easier way out is probably to call par() once, outside the function:
t.sim = Vectorize(function(n, es, n.sim){
d = numeric(n.sim)
p = numeric(n.sim)
for(i in 1:n.sim){
N = sqrt((n^2)/(2*n))
x = rnorm(n, es, 1)
y = rnorm(n, 0, 1)
a = t.test(x, y, var.equal = TRUE)
d[i] = a[[1]]/N
p[i] = a[[3]]
}
# par(mfcol = c(2, npar))
hist(p) ; hist(d)
}, "n")
#inputs
data <- c(30,300)
par(mfcol = c(2, length(data)))
t.sim(n = data, es = c(.1), n.sim = 1e3)
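A quick way to see this behaviour, as a minimal sketch with a hypothetical toy function in place of t.sim, is to record length(n) on every call:

```r
seen_lengths <- integer(0)

# Hypothetical toy function: records how long `n` is each time it runs.
f <- Vectorize(function(n, es) {
  seen_lengths <<- c(seen_lengths, length(n))
  n * es
}, "n")

out <- f(n = c(30, 300), es = 0.1)

seen_lengths  # one entry per call, each equal to 1
out           # one result per element of n
```

Because each call only ever sees a single n, anything that depends on the full length of n (such as the plot layout) has to be set up before calling the vectorized function.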

Unused arguments error in R studio

I get an error when I try to run this line of code:
nnetPred.model <- nnetPred(X, Y, step_size = 0.4,reg = 0.0002, h=50, niteration = 6000)
The error message is:
Error in nnetPred(X, Y, step_size = 0.4, reg = 2e-04, h = 50, niteration = 6000) :
unused arguments (step_size = 0.4, reg = 2e-04, h = 50, niteration = 6000)
My code is as below:
nnetPred <- function(X, Y, para = list()){
W <- para[[1]]
b <- para[[2]]
W2 <- para[[3]]
b2 <- para[[4]]
N <- nrow(X)
hidden_layer <- pmax(0, X%*% W + matrix(rep(b,N), nrow = N, byrow = T))
hidden_layer <- matrix(hidden_layer, nrow = N)
scores <- hidden_layer%*%W2 + matrix(rep(b2,N), nrow = N, byrow = T)
predicted_class <- apply(scores, 1, which.max)
return(predicted_class)
}
nnetPred.model <- nnetPred(X, Y, step_size = 0.4,reg = 0.0002, h=50, niteration = 6000)
It looks like you are trying to use variable arguments. In R, this means the ellipsis (...). This is how you would define the top of nnetPred to use variable arguments:
nnetPred <- function(X, Y, ...) {
para <- list(...)
This will work in your case, but is not really the best way to define that function, because it looks like you have a finite number of parameters. Only when you have an unknown number of parameters should you use variable argument lists. I would recommend simply putting your parameters in the parameter list. You can rename them if you want to:
nnetPred <- function(X, Y, step_size, reg, h, niteration) {
W <- step_size
b <- reg
W2 <- h
b2 <- niteration
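The three variants can be compared side by side in a small sketch. The toy functions strict, flexible and explicit are hypothetical and only mirror the signatures discussed above; they do not reproduce the neural network code.

```r
# Same shape as the original signature: extra names are not accepted.
strict <- function(X, Y, para = list()) length(para)

# Passing names that are not formal arguments raises an "unused argument" error.
msg <- tryCatch(
  strict(1, 2, step_size = 0.4, h = 50),
  error = function(e) conditionMessage(e)
)
msg

# With `...`, extra named arguments are collected into a list.
flexible <- function(X, Y, ...) {
  para <- list(...)
  names(para)
}
captured <- flexible(1, 2, step_size = 0.4, h = 50)

# With explicit formals, each argument is available by name.
explicit <- function(X, Y, step_size, reg, h, niteration) {
  c(step_size = step_size, reg = reg, h = h, niteration = niteration)
}
vals <- explicit(1, 2, step_size = 0.4, reg = 2e-4, h = 50, niteration = 6000)
```

Explicit formals have the added benefit that a misspelled argument name fails immediately instead of being silently swallowed by the ellipsis.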
try <- emd(xt2, tt2, boundary="wave")
Error in emd(xt2, tt2, boundary = "wave") :
unused argument (boundary = "wave")

Plot multiple graphs in one figure using a loop

I need to compute the efficient frontier with different risk measures and to use a bootstrapping technique to simulate possible outcomes. However, now I'm stuck: what I want to do is generate via a loop (which will be integrated later into a function) multiple efficient frontiers, each one associated with a possible future outcome, and plot them on the same figure so I can see how they change as the simulation goes on. Here is the loop that I wrote so far:
for (i in 1:B) {
idx <- sample(1:N, N, replace = TRUE)
new.x <- x[idx, ]
µ.b <- apply(X = new.x, 2, FUN = mean)
range.b[, i] <- seq(from = min(µ.b), to = max(µ.b), length.out = steps)
sigma.b <- apply(X = new.x, 2, FUN = sd)
riskCov.b[, i] <- sapply(range.b[, i], function(targetReturn) {
w <- MV_QP(new.x, targetReturn, Sigma)
sd(c(new.x %*% w))
})
xlim.b <- range(c(sigma.b, riskCov.b[, 1]), na.rm = TRUE)
ylim.b <- range(µ.b)
par(new = TRUE)
plot(x = riskCov.b[, i], y = range.b[, i], type = "l", xlim = xlim.b, ylim = ylim.b, xlab = "Risk", ylab = "Return", main = "Resampling EFs")
}
but the problem is that the x and y axes are redrawn each time the loop runs. How can this problem be solved?
I don't know if the optimization is correct. For plotting, you can try the following:
for (i in 1:B) {
idx <- sample(1:N, N, replace = TRUE)
new.x <- x[idx, ]
µ.b <- apply(X = new.x, 2, FUN = mean)
range.b[, i] <- seq(from = min(µ.b), to = max(µ.b), length.out = steps)
#sigma.b <- apply(X = new.x, 2, FUN = sd)
riskCov.b[, i] <- sapply(range.b[, i], function(targetReturn) {
w <- MV_QP(new.x, targetReturn,Sigma=cov(new.x))
sd(c(new.x %*% w))
})
}
xlim.b <- range(c(apply(X = x, 2, FUN= sd), riskCov.b), na.rm = TRUE) *c(0.98,1.02)
ylim.b <- range(range.b, na.rm = TRUE) *c(0.98,1.02) # cover the returns of all B resamples, not only the last µ.b
#par(new = TRUE)
for (i in 1:B){
if (i==1) plot(x = riskCov.b[, i], y = range.b[, i], type = "l", xlim = xlim.b, ylim = ylim.b, xlab = "Risk", ylab = "Return", main = "Resampling EFs") else
lines(x = riskCov.b[, i], y = range.b[, i],col=rainbow(B)[i])
}
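The same pattern (compute everything first, fix the axis limits once, then plot() the first curve and add the rest with lines()) can be tried on synthetic data. The curves below are made up for illustration and have nothing to do with the actual frontiers; pdf(NULL) only suppresses the plot output.

```r
B  <- 5
xs <- seq(0, 1, length.out = 20)

# Assumption: made-up curves, one column per "resample".
ys <- sapply(1:B, function(i) xs^2 + i / 10)

pdf(NULL)  # discard plot output; remove this line to see the figure

# Axis limits are computed once from all curves, so nothing gets clipped.
ylim_all <- range(ys)
cols <- rainbow(B)

plot(xs, ys[, 1], type = "l", col = cols[1], ylim = ylim_all,
     xlab = "Risk", ylab = "Return", main = "Resampling EFs")
for (i in 2:B) {
  lines(xs, ys[, i], col = cols[i])  # lines() adds to the existing plot
}
invisible(dev.off())
```

Unlike par(new = TRUE), lines() does not redraw the axes, so the tick labels stay fixed across all curves.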
Depending on your data, you should end up with a single plot showing all B resampled frontiers, each drawn as its own line.
