I need to write a function that performs a simulation to evaluate the coverage of a bootstrap confidence interval for the variance of n samples from a normal distribution. Below is what I've attempted, but the coverage it returns (the mean of the indicator for whether the CI contains the true variance) is always 0 or 0.002.
Var_CI_Coverage <- function(true_mean,true_var, nsim, nboot, alpha, nsamples){
cover = NULL
for(k in 1:nsim){
Var = as.numeric()
y <- rnorm(1, mean = true_mean, sd = sqrt(true_var))
for(i in 1:nboot){
resample_y <- sample(y, size = nsamples, replace = TRUE)
Var[i] <- var(resample_y)
}
LB <- quantile(Var, probs=c(alpha/2))
UB <- quantile(Var, probs=c(1 - (alpha/2)))
cover[k] <- ifelse(LB <= true_var & UB >= true_var, 1, 0)
}
return(mean(cover))
}
Var_CI_Coverage(true_mean= 0, true_var = 4, nsim = 500, nboot = 1000, alpha = 0.05, nsamples = 10)
The main problem is you generate y using
y <- rnorm(1, mean = true_mean, sd = sqrt(true_var))
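which means y is a single value, so the bootstrap has nothing real to resample. When y < 1, every bootstrap sample is just that single value repeated nsamples times (variance 0); when y >= 1, sample() treats a length-one numeric as shorthand for 1:y instead of as data. You need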
y <- rnorm(nsamples, mean = true_mean, sd = sqrt(true_var))
Then you get samples with actual variance, and you get a coverage estimate that looks more in the right ballpark (no comment on whether it's correct, I haven't tried to check).
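For reference, here is a minimal sketch of the function with that one-line fix applied. The name var_ci_coverage, the use of replicate() for the inner loop, and the smaller nsim and nboot in the example call are my own choices, not part of the original code. It still uses the plain percentile interval, which may fall short of the nominal level for a variance at such a small sample size.

```r
# Sketch: coverage of a percentile-bootstrap CI for the variance.
# Assumption: same arguments as the original, but y now has nsamples values.
var_ci_coverage <- function(true_mean, true_var, nsim, nboot, alpha, nsamples) {
  cover <- logical(nsim)
  for (k in seq_len(nsim)) {
    # Draw a full sample of size nsamples (the fix)
    y <- rnorm(nsamples, mean = true_mean, sd = sqrt(true_var))
    # Bootstrap distribution of the sample variance
    boot_var <- replicate(nboot, var(sample(y, size = nsamples, replace = TRUE)))
    # Percentile interval at level 1 - alpha
    ci <- quantile(boot_var, probs = c(alpha / 2, 1 - alpha / 2))
    cover[k] <- ci[[1]] <= true_var && true_var <= ci[[2]]
  }
  mean(cover)  # proportion of intervals containing true_var
}

set.seed(1)
coverage <- var_ci_coverage(true_mean = 0, true_var = 4, nsim = 200,
                            nboot = 500, alpha = 0.05, nsamples = 10)
coverage
```

The result is a proportion between 0 and 1. How close it gets to 1 - alpha depends on nsamples and on the interval method.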
Hi everyone, I'm using R to try to simulate some economic models. We do this primarily through the Euler equation. I've figured out that applying shocks to state variables defined within the function (in this case k) is pretty simple, as seen in the code below. However, I'm interested in applying a shock to parameters like delta, theta and rho.
For what it's worth, I'm using the R package deSolve. Any help is appreciated.
library('deSolve')
##############################################
#Computing the neoclassical growth model in R#
##############################################
#parameters and state space
A<-1
theta<- 0.1
alpha<-0.5
delta<-0.3
rho<-0.9
kinital <- c(k = 1)
times <- seq(from = 0, to = 100, by = 0.2)
#define euler equation
euler <- function(t, k, parms)
list((1/theta)*alpha*A*k^(alpha-1)-delta-rho)
#Compute
out <- ode(y = kinital, times = times, func = euler,
parms = NULL)
plot(out, main = "Euler equation", lwd = 2)
#########################
#Temporary Capital Shock#
########################
eventdat <- data.frame(var = c("k"),
time = c(30) ,
value = c(10),
method = c("add"))
eventdat1 <- data.frame(var = c("k"),
time = c(30) ,
value = c(-5),
method = c("add"))
out3<-ode(y=kinital,times=times,func=euler,events=list(data=eventdat))
out4<-ode(y=kinital,times=times,func=euler,events=list(data=eventdat1))
plot(out,out3,out4,main="Temporary Shock",lwd=3)
It's not a great fix, but one way to deal with this type of problem is to make the parameter value conditional on time, so that the shock applies only over some interval. I do this for depreciation as follows:
##############################
#Temporary Depreciation Shock#
##############################
#New Vars
A<-1
theta<- 0.1
alpha<-0.5
delta<-0.3
rho<-0.9
kinital <- c(k = 17)
times <- seq(from = 0, to = 400, by = 0.2)
#Redefine Euler
euler2<-function(t,k,parms){
list((1/theta)*alpha*A*k^(alpha-1)-delta-rho)}
euler3<-function(t,k,parms){
list((1/theta)*alpha*A*k^(alpha-1)-(delta+0.05*(t>=30&t<=40))-rho)}
#Output
doutbase<-ode(y=kinital,times=times, func=euler2, parms=NULL)
doutchange<-ode(y=kinital,times=times, func=euler3, parms=NULL)
#plots
plot(doutbase,doutchange,main="Change in depreciation at t=30 until t=40",lwd=2)
A colleague on Stack Exchange suggested a cleaner approach: treat delta as a second state variable with zero derivative, and shock it with events. This is seen below:
A<-1
theta<- 0.1
alpha <- 0.5
rho<-0.9
init <- c(k = 17, delta = 0.3)
times <- seq(from = 0, to = 400, by = 0.2)
euler.function<-function(t,y, parms){
k <- y[1]
delta <- y[2]
dk <- (1/theta)*alpha*A*k^(alpha-1)-delta-rho
list(c(dk, 0))}
deventdat<- data.frame(var = c("delta", "delta"),
time = c(30, 51) ,
value = c(0.1, -0.1),
method = c("add"))
res<-ode(y=init,times=times, func=euler.function, parms=NULL, events=list(data=deventdat))
plot(res,lwd=2)
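The time-conditioning idea can also be written so that the parameters and the shock window travel through the parms argument instead of living in the global environment. The following is only a sketch: the names euler_parms, shock, t_on and t_off are hypothetical, and the ode() call is guarded so the derivative function can be checked even where deSolve is not installed.

```r
# Sketch: pass parameters and a temporary shock to delta through `parms`.
# Assumption: shock, t_on and t_off are illustrative names, not deSolve API.
parms <- list(A = 1, theta = 0.1, alpha = 0.5, delta = 0.3, rho = 0.9,
              shock = 0.05, t_on = 30, t_off = 40)

euler_parms <- function(t, y, parms) {
  with(parms, {
    # delta is raised by `shock` only while t_on <= t <= t_off
    delta_t <- delta + shock * (t >= t_on & t <= t_off)
    dk <- (1 / theta) * alpha * A * y[[1]]^(alpha - 1) - delta_t - rho
    list(dk)
  })
}

# Check the derivative directly, before and during the shock window
state <- c(k = 17)
d_before <- euler_parms(10, state, parms)[[1]]
d_during <- euler_parms(35, state, parms)[[1]]

# Solve with and without the shock if deSolve is available
if (requireNamespace("deSolve", quietly = TRUE)) {
  times <- seq(from = 0, to = 100, by = 0.2)
  no_shock <- modifyList(parms, list(shock = 0))
  out_base  <- deSolve::ode(y = state, times = times, func = euler_parms, parms = no_shock)
  out_shock <- deSolve::ode(y = state, times = times, func = euler_parms, parms = parms)
}
```

Shocking theta or rho instead would follow the same pattern: apply the time condition to that parameter inside the function. The two solutions can be compared with plot(out_base, out_shock) as in the code above.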
I am trying to find the maximum likelihood estimate (MLE) of alpha for a beta distribution, given beta = 1.
I tried using maxlogL from the estimationtools package but g
x <- rbeta(n = 1000, shape1 = 0.7, shape2 = 1)
alpha_hat <- maxlogL(x = x, dist = "dbeta", fixed = list(shape2 = 1), lower = (0), upper = (1), link = list(over = "shape1", fun = "log_link"))
summary(alpha_hat)
For the normal distribution, the following computation does give me an estimate of sd.
x <- rnorm(n = 10000, mean = 160, sd = 6)
theta_1 <- maxlogL(x = x, dist = 'dnorm', control = list(trace = 1),link = list(over = "sd", fun = "log_link"),
fixed = list(mean = 160))
summary(theta_1)
Could someone point out the mistake in the first piece of code?
I don't know. I'm going to be lazy and do it a different way I'm more familiar with:
library(bbmle)
m <- mle2(x~dbeta(shape1=exp(loga),shape2=1),
data=data.frame(x), start=list(loga=0))
## estimate (back-transformed)
exp(coef(m)) ## 0.6731152
## profile confidence interval (back-transformed)
exp(confint(m))
## 2.5 % 97.5 %
## 0.6322529 0.7157005
Setting up a simple bootstrap function ...
bootfun <- function() {
newx <- sample(x,size=length(x),replace=TRUE)
newm <- update(m, data=data.frame(x=newx))
return(coef(newm))
}
set.seed(101)
bootsamp <- replicate(500, bootfun())
exp(quantile(bootsamp, c(0.025, 0.975)))
## 2.5% 97.5%
## 0.6533478 0.7300200
In fact, for this case the (very quick) Wald confidence intervals are probably fine ...
exp(confint(m,method="quad"))
## 2.5 % 97.5 %
## 0.6462591 0.7315456
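As a cross-check that needs no extra packages: with shape2 = 1 the beta density reduces to alpha * x^(alpha - 1), so the log-likelihood is n*log(alpha) + (alpha - 1)*sum(log(x)), which is maximized at alpha = -n / sum(log(x)). The sketch below compares that closed form with a numerical optimum from optimize(). The seed and the search interval c(0.01, 10) are arbitrary choices of mine.

```r
# Sketch: closed-form vs. numerical MLE of shape1 when shape2 is fixed at 1.
set.seed(101)
x <- rbeta(n = 1000, shape1 = 0.7, shape2 = 1)

# Closed form from setting d/d(alpha) of the log-likelihood to zero
alpha_closed <- -length(x) / sum(log(x))

# Numerical check: minimize the negative log-likelihood over alpha
negloglik <- function(a) -sum(dbeta(x, shape1 = a, shape2 = 1, log = TRUE))
alpha_num <- optimize(negloglik, interval = c(0.01, 10))$minimum

c(closed_form = alpha_closed, numerical = alpha_num)
```

The two values should agree to within the tolerance of optimize(), and both should land near the true value of 0.7.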
I want to simulate ARIMA(1,1,0) with varying:
sample sizes
phi values
standard deviation values.
I like how the R code below simulates a single ARIMA(1,1,0) series, and I want to follow the same format to simulate many ARIMA(1,1,0) series with varying sample sizes, phi values and standard deviation values.
wn <- rnorm(10, mean = 0, sd = 1)
ar <- wn[1:2]
for (i in 3:10){
ar<- arima.sim(n=10,model=list(ar=-0.7048,order=c(1,1,0)),start.innov=4.1,n.start=1,innov=wn)
}
I have asked a similar question here and was given a good answer based on my question, but now I see that the arima.sim() function is indispensable for simulating ARIMA time series, so I want to incorporate it into my style of simulating ARIMA time series.
I came up with this trial, which uses the arima.sim() function to simulate ARIMA(1,1,0) time series of lengths N = c(15, 20) with varying standard deviation values and phi values, by first generating N random numbers and then using the first two random numbers as the first two values of the series. The 3rd to nth values are then made to follow ARIMA(1,1,0).
Here is what I have tried below:
N <- c(15L, 20L)
SD = c(1, 2) ^ 2
phi = c(0.2, 0.4)
res <- vector('list', length(N))
names(res) <- paste('N', N, sep = '_')
set.seed(123L)
for (i in seq_along(N)){
res[[i]] <- vector('list', length(SD))
names(res[[i]]) <- paste('SD', SD, sep = '_')
ma <- matrix(NA_real_, nrow = N[i], ncol = length(phi))
for (j in seq_along(SD)){
wn <- rnorm(N[i], mean = 0, sd = SD[j])
ar[[1:2, ]] <- wn[[1:2]]
for (k in 3:N[i]){
ar[k, ] <- arima.sim(n=N[[i]],model=list(ar=phi[[k]],order=c(1,1,0)),start.innov=4.1,n.start=1,innov=wn)
}
colnames(ar) <- paste('ar_theta', phi, sep = '_')
res[[i]][[j]] <- ar
}
}
res1 <- lapply(res, function(dat) do.call(cbind, dat))
sapply(names(res1), function(nm) write.csv(res1[[nm]],
file = paste0(nm, ".csv"), row.names = FALSE, quote = FALSE))
The last two lines write the time series data in .csv and save it in my working directory.
Here is one possible method using Map. Please edit your post to include expected output if this does not meet your requirements.
N <- c(15L, 20L)
SD <- c(1, 2) ^ 2
phi = c(0.2, 0.4)
## generate all combos
all_combos <- expand.grid(N = N, SD = SD, phi = phi)
## create function
fx_arima <- function(n, SD, phi) {
arima.sim(n = n,
model=list(ar=phi, order = c(1, 1, 0)),
start.innov = 4.1,
n.start = 1,
rand.gen = function(n) rnorm(n, mean = 0, sd = SD))[-1L]
}
## find arima for all combos using Map
set.seed(123L)
res = Map(fx_arima, all_combos[["N"]], all_combos[["SD"]], all_combos[["phi"]])
## or a little bit more work:
set.seed(123L)
res2 = by(all_combos, all_combos["N"],
function(DF) {
res = mapply(fx_arima, DF[["N"]], DF[["SD"]], DF[["phi"]])
colnames(res) = paste("SD", DF[["SD"]], "phi", DF[["phi"]], sep = "_")
res
})
res2
## write to csv
Map(function(file, DF) write.csv(DF, paste0("N_", file, ".csv")), names(res2), res2)
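To see which series belongs to which parameter combination, the list returned by Map can be labelled and its lengths checked. This is only a sketch: sim_one is a hypothetical, simplified stand-in for fx_arima that passes the standard deviation through arima.sim's sd argument (forwarded to rnorm) and leaves the burn-in at its default, so the simulated values will differ from those above.

```r
# Sketch: label each simulated series and confirm it has the requested length.
all_combos <- expand.grid(N = c(15L, 20L), SD = c(1, 2)^2, phi = c(0.2, 0.4))

# Hypothetical simplified simulator; sd is forwarded to rnorm() by arima.sim()
sim_one <- function(n, SD, phi) {
  sim <- arima.sim(n = n, model = list(ar = phi, order = c(1, 1, 0)), sd = SD)
  as.numeric(sim)[-1L]  # with one difference the series has n + 1 values; drop the first
}

set.seed(123L)
res <- Map(sim_one, all_combos[["N"]], all_combos[["SD"]], all_combos[["phi"]])
names(res) <- paste0("N_", all_combos[["N"]],
                     "_SD_", all_combos[["SD"]],
                     "_phi_", all_combos[["phi"]])

# One entry per row of all_combos, each with N values
lengths(res)
```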
The function needs to return the mean and standard deviation of each sample.
This is what I have:
sample_gamma <- function(alpha, beta, n, iter) {
mean = alpha/beta
var = alpha/(beta)^2
sd = sqrt(var)
gamma = rgamma(n,shape = alpha, scale = 1/beta)
sample_gamma = data.frame(mean = replicate(n = iter, expr = mean))
}
I'm very lost on this. I also need to create a data frame for this function.
Thank you for your time.
Edit:
sample_gamma <- function(alpha, beta, n, iter) {
output <- rgamma(iter, alpha, 1/beta)
output_1 <- matrix(output, ncol = iter)
means <- apply(output_1, 2, mean)
sds <- apply(output_1, 2, sd)
mystats <- data.frame(means, sds)
return(mystats)
}
This works except for the sds. It's returning NAs.
It's not really clear to me what you want. But say you want to create 10 samples of size 1000, alpha = 1, beta = 2. Then you can create a single stream of rgamma realizations, dimension them into a matrix, then get your stats with apply, and finally create a data frame with those vectors:
output <- rgamma(10*1000, 1, 1/2)
output <- matrix(output, ncol = 10)
means <- apply(output, 2, mean)
sds <- apply(output, 2, sd)
mystats <- data.frame(means, sds)
You could wrap your function around that code, replacing the hard values with parameters.
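A sketch of that wrapping is shown here. The NAs in the edited attempt come from drawing only iter values and spreading them over iter columns, so each column holds a single value, and sd() of a single value is NA. Drawing n * iter values gives n rows per column. One assumption is flagged loudly: beta is treated as a rate parameter, matching the scale = 1/beta in the first attempt. If beta is meant as a scale, swap rate for scale.

```r
# Sketch: iter samples of size n from a gamma distribution, one per column.
# Assumption: beta is the rate (so the theoretical mean is alpha / beta).
sample_gamma <- function(alpha, beta, n, iter) {
  draws <- matrix(rgamma(n * iter, shape = alpha, rate = beta),
                  nrow = n, ncol = iter)
  data.frame(means = colMeans(draws),      # mean of each sample
             sds   = apply(draws, 2, sd))  # standard deviation of each sample
}

set.seed(42)
mystats <- sample_gamma(alpha = 1, beta = 2, n = 1000, iter = 10)
mystats
```

The data frame has one row per sample, so iter rows in total.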
I want to repeatedly sample values based on a certain condition. For example I want to create a sample of 100 values.
With probability 0.7 each value will be sampled from one distribution, and otherwise from another distribution.
Here is a way to do what I want:
set.seed(20)
A<-vector()
for (i in 1:100){
A[i]<-ifelse(runif(1,0,1)>0.7,rnorm(1, mean = 100, sd = 20),runif(1, min = 0, max = 1))
}
I am sure there are other, more elegant ways that don't use a for loop.
Any suggestions?
You can sample an indicator, which defines which distribution you draw from.
ind <- sample(0:1, size = 100, prob = c(0.3, 0.7), replace = TRUE)
A <- ind * rnorm(100, mean = 100, sd = 20) + (1 - ind) * runif(100, min = 0, max = 1)
This avoids the for loop, but it draws more random values than needed: 100 from each distribution, of which only one per position is kept.
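If the extra draws matter, one possible variant is to use the indicator as a logical index and draw only as many values from each distribution as are needed. This is a sketch, and from_normal is a name I made up for the indicator.

```r
# Sketch: draw from each distribution only where the indicator selects it.
set.seed(20)
n <- 100
from_normal <- runif(n) < 0.7  # TRUE with probability 0.7

A <- numeric(n)
A[from_normal]  <- rnorm(sum(from_normal), mean = 100, sd = 20)
A[!from_normal] <- runif(sum(!from_normal), min = 0, max = 1)
```

Here the number of normal draws is itself random, with expectation 0.7 * n.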
If the proportion drawn from each distribution is fixed rather than random, you can draw the right number from each distribution and then shuffle the result:
n <- 100
A <- sample(c(rnorm(0.7*n, mean = 100, sd = 20), runif(0.3*n, min = 0, max = 1)))