Using a random matrix in a loop (R)

I am trying to create an empirical histogram of eigenvalue spacings for random matrices using a loop. It seems simple, but it is not working so far. I am getting this error: "Error in M[k] <- (c(x[k], y[k], y[k], z[k])) : number of items to replace is not a multiple of replacement length."
I tried writing M[ ,k] but I still got the same error. If anyone can help me with that line, it would be great! Here is my code:
x <- rnorm(1000,0,1)
y <- rnorm(1000,0,1/2)
z <- rnorm(1000,0,1)
M <- matrix(0,2,2)
a <- rep(0,1000)
b <- rep(0,1000)
s <- rep(0,1000)
for(k in 1:1000){
M[k] =(c(x[k],y[k],y[k],z[k]))
temp = eigen(M[k])$value
a[k] <- max(temp)
b[k] <- min(temp)
s[k] <- a[k]-b[k]
}

If you are only interested in creating s, you can make your code considerably simpler by using sapply instead of a loop. You create the matrix M for each iteration and return the difference between the maximum and minimum eigenvalue. This will make your vector s without all the intermediate variables.
set.seed(69) # Makes the example reproducible
x <- rnorm(1000, 0, 1)
y <- rnorm(1000, 0, 1/2)
z <- rnorm(1000, 0, 1)
s <- sapply(seq(1000), function(k) {
M <- matrix(c(x[k], y[k], y[k], z[k]), 2, 2)
ev <- eigen(M)$values # compute the eigenvalues once
max(ev) - min(ev)
})
hist(s)
You can even get rid of x, y, z if you just sample as you go:
set.seed(69) # Makes the example reproducible
s <- sapply(seq(1000), function(k) {
M <- matrix(c(rnorm(1), rep(rnorm(1, 0, 1/2), 2), rnorm(1)), 2, 2)
ev <- eigen(M)$values # compute the eigenvalues once
max(ev) - min(ev)
})
hist(s)
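For a 2x2 symmetric matrix, the spacing can also be computed without calling eigen() at all. This is a sketch rather than part of the answer above: it relies on the standard closed form for the eigenvalues of the matrix with rows (x, y) and (y, z), which gives a spacing of sqrt((x - z)^2 + 4*y^2). The names s_closed and s_eigen are made up for illustration.

```r
set.seed(69) # Makes the example reproducible
x <- rnorm(1000, 0, 1)
y <- rnorm(1000, 0, 1/2)
z <- rnorm(1000, 0, 1)

# Closed-form spacing for the symmetric matrix [x y; y z], fully vectorized
s_closed <- sqrt((x - z)^2 + 4 * y^2)

# Cross-check against eigen() on each matrix
s_eigen <- sapply(seq_along(x), function(k) {
  M <- matrix(c(x[k], y[k], y[k], z[k]), 2, 2)
  ev <- eigen(M, symmetric = TRUE, only.values = TRUE)$values
  max(ev) - min(ev)
})

all.equal(s_closed, s_eigen)
hist(s_closed)
```

The closed form only applies to the 2x2 case; for larger matrices the eigen() approach is still needed.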

Related

Make datasets by loop in R

I'm trying to learn how to write a loop in R.
I have this:
sigma2 <- 0.4
a0 <- -0.1260805
b <- 0.1260805
tt <- 1:50
z <- rnorm(50, 0, sigma2)
y <- rep(1, 50)
for(i in 1:50){
y[i]=exp(a0 + b*tt[i])*exp(z[i])
}
y
I want to run the code above 1000 times, since I want to test a hypothesis at the 0.05 level.
I tried this, but it seems to be wrong:
aa <- rep(1, 1000)
for(i in 1:1000){
y[i]=exp(a0 + b1*tt[i])*exp(z[i])
}
Thanks for help!
I think this is what you want (or at least closer):
# re-write original code with vectorization:
n <- 50
sigma2 <- 0.4
a0 <- -0.1260805
b <- 0.1260805
tt <- 1:n
z <- rnorm(n, 0, sigma2)
y <- exp(a0 + b*tt)*exp(z)
# do it 20 times
result <- replicate(20, exp(a0 + b*tt)*exp(rnorm(n, 0, sigma2)))
result is a 50x20 matrix - one column per repetition.
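To scale this to the 1000 repetitions mentioned in the question, only the first argument of replicate() changes. The sketch below also shows one way the simulated columns might feed a test at the 0.05 level. The question does not say which hypothesis is meant, so the test used here (slope equal to zero in a regression of log(y) on tt) is purely an assumption, as are the names n_rep, sims, p_values and reject_rate.

```r
set.seed(1)
n <- 50
sigma2 <- 0.4 # note: rnorm() treats its third argument as a standard deviation
a0 <- -0.1260805
b <- 0.1260805
tt <- 1:n
n_rep <- 1000

# One column per simulated dataset: a 50 x 1000 matrix
sims <- replicate(n_rep, exp(a0 + b * tt) * exp(rnorm(n, 0, sigma2)))

# ASSUMPTION: the hypothesis of interest is H0: slope = 0 in log(y) ~ tt
p_values <- apply(sims, 2, function(y) {
  fit <- lm(log(y) ~ tt)
  summary(fit)$coefficients["tt", "Pr(>|t|)"]
})

# Share of simulated datasets in which H0 is rejected at the 0.05 level
reject_rate <- mean(p_values < 0.05)
reject_rate
```

If a different hypothesis is intended, only the function passed to apply() needs to change.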

Is it possible to use vector math in R for a summation involving intervals?

Title's a little rough, open to suggestions to improve.
I'm trying to calculate time-average covariances for a 500 length vector.
This is the equation we're using
The result I'm hoping for is a vector with an entry for k from 0 to 500 (0 would just be the variance of the whole set).
I've started with something like this, but I know I'll need to reference the gap (i) in the first mean comparison as well:
x <- rnorm(500)
xMean <-mean(x)
i <- seq(1, 500)
dfGam <- data.frame(i)
dfGam$gamma <- (1/(500-dfGam$i))*(sum((x-xMean)*(x[-dfGam$i]-xMean)))
Is it possible to do this using vector math or will I need to use some sort of for loop?
Here's the for loop that I've come up with for the solution:
gamma_func <- function(input_vec) {
output_vec <- c()
input_mean <- mean(input_vec)
iter <- seq(1, length(input_vec)-1)
for(val in iter){
iter2 <- seq((val+1), length(input_vec))
gamma_sum <- 0
for(val2 in iter2){
gamma_sum <- gamma_sum + (input_vec[val2]-input_mean)*(input_vec[val2-val]-input_mean)
}
output_vec[val] <- (1/length(iter2))*gamma_sum
}
return(output_vec)
}
Thanks
Using data.table, mostly for the shift function to make x_{t - k}, you can do this:
library(data.table)
gammabar <- function(k, x){
xbar <- mean(x)
n <- length(x)
df <- data.table(xt = x, xtk = shift(x, k))[!is.na(xtk)]
df[, sum((xt - xbar)*(xtk - xbar))/n]
}
gammabar(k = 10, x)
# [1] -0.1553118
The filter [!is.na(xtk)] starts the sum at t = k + 1, because xtk will be NA for the first k indices due to being shifted by k.
Reproducible x
x <- c(0.376972124936433, 0.301548373935665, -1.0980231706536, -1.13040590360378,
-2.79653431987176, 0.720573498411587, 0.93912102300901, -0.229377746707471,
1.75913134696347, 0.117366786802848, -0.853122822287008, 0.909259181618213,
1.19637295955276, -0.371583903741348, -0.123260233287436, 1.80004311672545,
1.70399587729432, -3.03876460529759, -2.28897494991878, 0.0583034949929225,
2.17436525195634, 1.09818265352131, 0.318220322390854, -0.0731475581637693,
0.834268741278827, 0.198750636733429, 1.29784138432631, 0.936718306241348,
-0.147433193833294, 0.110431994640128, -0.812504663900505, -0.743702167768748,
1.09534507180741, 2.43537370755095, 0.38811846676708, 0.290627670295127,
-0.285598287083935, 0.0760147178373681, -0.560298603759627, 0.447188372143361,
0.908501134499943, -0.505059597708343, -0.301004012157305, -0.726035976548133,
-1.18007702699501, 0.253074712637114, -0.370711296884049, 0.0221795637601637,
0.660044122429767, 0.48879363533552)
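If all lags are needed at once, a base R sketch along the following lines may be enough. It assumes the same divisor n as gammabar above (the loop in the question divides by n - k instead), and the function name gamma_all and its max_lag argument are hypothetical. With this divisor the result should agree with stats::acf(type = "covariance"), which is used here as a cross-check.

```r
# Sample autocovariance at lags 0..max_lag, dividing by n (as gammabar does)
gamma_all <- function(x, max_lag = length(x) - 1) {
  n <- length(x)
  d <- x - mean(x) # centered series
  sapply(0:max_lag, function(k) {
    # pair x_t with x_{t-k} for t = k+1, ..., n
    sum(d[(k + 1):n] * d[1:(n - k)]) / n
  })
}

set.seed(1)
x <- rnorm(500)
g <- gamma_all(x, max_lag = 10)

# Cross-check against the built-in estimator
g_acf <- as.numeric(acf(x, lag.max = 10, type = "covariance", plot = FALSE)$acf)
all.equal(g, g_acf)
```

To use the n - k divisor from the question's loop, replace `/ n` with `/ (n - k)` inside the anonymous function; in that case max_lag must stay below n.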

For loops for nested variables within function in R

I would like to iterate through vectors of values and calculate something for every value while being within a function environment in R. For example:
# I have costs for 3 companies
c <- c(10, 20, 30)
# I have the same revenue across all 3
r <- 100
# I want to obtain the profits for all 3 within one variable
result <- list()
# I could do this in a for loop
for(i in 1:3){
result[i] <- r - c[i]
}
Now let's assume I have a model that is very long and I define everything as a function which is to be solved with various random draws for the costs.
# Random draws
n <- 1000
r <- rnorm(n, mean = 100, sd = 10)
c1 <- rnorm(n, mean = 10, sd = 1)
c2 <- rnorm(n, mean = 20, sd = 2)
c3 <- rnorm(n, mean = 30, sd = 3)
X <- data.frame(r, c1, c2, c3)
fun <- function(x){
r <- x[1]
c <- c(x[2], x[3], x[4])
for(i in 1:3){
result[i] <- r - c[i]
}
return(result)
}
I could then evaluate the result for all draws by iterating through the rows of randomly sampled input data.
for(j in 1:n){
x <- X[j,]
y <- fun(x)
}
In this example, the output variable y would contain the nested result variable, which comprises the results for all 3 companies. However, my approach results in an error, and I think it has to do with the fact that I try to return a nested variable. How would you approach something like this?
I would suggest rethinking your coding approach. This is a very un-R-like way of doing things.
For example, the first for loop can be written much more succinctly as
x <- c(10, 20, 30)
r <- 100
result <- lapply(-x, `+`, r)
Then fun becomes something like
fun <- function(x) lapply(-x[-1], `+`, x[1])
To then operate over the rows of a data.frame (which is what you seem to do in the last step), you can use something like
apply(X, 1, fun)
where the MARGIN = 1 argument in apply ensures that you are applying a function per row (as opposed to per column).
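For this particular calculation, a fully vectorized sketch is also possible, because subtraction in R recycles a length-n vector down the columns of an n-row matrix. This assumes the profits are simply revenue minus cost, as in the example; for a long model that cannot be written with vectorized operations, the apply() approach above is the more general route. The names cost_cols and profits are illustrative.

```r
set.seed(1)
n <- 1000
X <- data.frame(
  r  = rnorm(n, mean = 100, sd = 10),
  c1 = rnorm(n, mean = 10, sd = 1),
  c2 = rnorm(n, mean = 20, sd = 2),
  c3 = rnorm(n, mean = 30, sd = 3)
)

cost_cols <- c("c1", "c2", "c3")

# X$r has one entry per row, so it is recycled down each cost column
profits <- X$r - as.matrix(X[, cost_cols])
colnames(profits) <- paste0("profit", 1:3)

head(profits, 3) # one row per draw, one column per company
```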
Here's an approach using your function and a for loop:
# Random draws
n <- 1000
r <- rnorm(n, mean = 100, sd = 10)
c1 <- rnorm(n, mean = 10, sd = 1)
c2 <- rnorm(n, mean = 20, sd = 2)
c3 <- rnorm(n, mean = 30, sd = 3)
X <- data.frame(r, c1, c2, c3)
result <- list()
fun <- function(x){
r <- x[[1]]
c <- c(x[[2]], x[[3]], x[[4]])
for(i in 1:3){
result[i] <- r - c[i]
}
return(result)
}
# Create a list to store results
profits <- rep(rep(list(1:3)),nrow(X))
# Loop through each row of the data frame and store the result in profits.
for(i in 1:nrow(X)){
profits_temp <-
fun(list(X[i,"r"],X[i,"c1"],X[i,"c2"],X[i,"c3"]))
for(j in 1:3)
profits[[i]][[j]] <- profits_temp[[j]]
}
# Eye results
profits[[1]]
#> [1] 93.23594 81.25731 70.27699
profits[[2]]
#> [1] 80.50516 69.27517 63.36439

How to create a loop to generate increasing sample sizes in a simulation

I'm trying to create a simulation to calculate the confidence interval for a binomial proportion. So far I have a function that calculates the lower and upper bounds and I have generated and stored the type of data I want (in a matrix, I'm not sure about that).
How can I create a loop that generates samples with different sizes? I'd like to test how the formula performs when calculating the intervals with sample sizes n=10, 11, 12,... up to 100.
My code so far:
## functions that calculate lower and upper bounds
ll <- function(x, cl=0.95) {
n <- length(x)
p.est <- mean(x)
z = abs(qnorm((1-cl)/2))
return((p.est) - z*sqrt(p.est*(1-p.est)/n))
}
ul <- function(x, cl=0.95) {
n <- length(x)
p.est <- mean(x)
z = abs(qnorm((1-cl)/2))
return((p.est) + z*sqrt(p.est*(1-p.est)/n))
}
## my simulation for n=10 and 200 repetitions.
p <- 0.4
n <- 10
rep <- 200
dat <- rbinom(rep*n,1,p)
x <- matrix(dat, ncol=rep)
ll.res <- apply(x, 2, ll)
ul.res <- apply(x, 2, ul)
hits <- ll.res <= p & p <= ul.res
sum(hits==1)/rep
I'm not sure which values you want to compare between the different sample sizes, but wrapping your simulation in a for loop and using lists to store the results should work:
nrep <- 200 # repetitions per sample size
p <- 0.4
ns <- 10:100
hits <- list()
ll.res <- list()
ul.res <- list()
value <- numeric(length(ns)) # empirical coverage for each n
for(i in seq_along(ns)){
n <- ns[i]
dat <- rbinom(nrep*n, 1, p)
x <- matrix(dat, ncol=nrep)
ll.res[[i]] <- apply(x, 2, ll)
ul.res[[i]] <- apply(x, 2, ul, cl=0.95)
hits[[i]] <- ll.res[[i]] <= p & p <= ul.res[[i]]
value[i] <- mean(hits[[i]])
}
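If only the coverage rate per sample size is of interest, the same simulation can be written more compactly and plotted against n. The following is a sketch under that assumption; the helper covers() and the vector coverage are hypothetical names, and the interval is the same normal-approximation formula as in ll and ul from the question.

```r
set.seed(42)
p <- 0.4
nrep <- 200
ns <- 10:100

# TRUE if the normal-approximation interval from sample x contains p
covers <- function(x, p, cl = 0.95) {
  n <- length(x)
  p.est <- mean(x)
  z <- qnorm(1 - (1 - cl) / 2)
  half <- z * sqrt(p.est * (1 - p.est) / n)
  (p.est - half) <= p && p <= (p.est + half)
}

# Empirical coverage for each sample size
coverage <- sapply(ns, function(n) {
  mean(replicate(nrep, covers(rbinom(n, 1, p), p)))
})

plot(ns, coverage, type = "l", xlab = "n", ylab = "empirical coverage")
abline(h = 0.95, lty = 2) # nominal level
```

With only 200 repetitions per sample size the curve will be noisy; increasing nrep smooths it at the cost of run time.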

Plotting a histogram in R

I have to solve the following exercise.
(1) Create 100 Poisson distributed r.v.'s with lambda = 4
(2) Calculate the mean of the sample, generated in (1).
(3) Repeat (1) and (2) 10.000 times.
(4) create a vector, containing the 10.000 means.
(5) plot the vector in a histogram.
Is the following solution(?) right?
> as.numeric(x)
> for(i in 1:10000){
> p <- rpois(100, lambda = 4)
> m <- mean(p)
> append(x, m)
>}
> hist(x, breaks = 20)
It's a little funny. You can quickly do what you ask in more legible ways. For example:
L <- 10000
emptyvector <- rep(NA, L)
for(i in 1:L){
emptyvector[i] <- mean(rpois(100, lambda = 4))
}
hist(emptyvector)
I would have taken advantage of the replicate() function which would create a matrix of results and then run colMeans to quickly get my vector.
meanvector <- colMeans(replicate(10000, rpois(100, lambda = 4)))
hist(meanvector, main = "Mean values from 10,000 runs of \nPoisson n = 100")
hist(replicate(10000, mean(rpois(100, lambda = 4))))
You need to assign the result of append() back to x.
x1 <- x <- NULL
for(i in 1:10000){
p <- rpois(100, lambda = 4)
m <- mean(p)
x[length(x) + 1] <- m
x1 <- append(x1, m)
## Either x or x1 will suffice for the histogram
}
hist(x1, breaks = 20)
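The reason the original loop leaves x unchanged is that append() returns a new vector rather than modifying its argument. A minimal sketch of the difference, using an arbitrary value in place of a real sample mean:

```r
x <- numeric(0)

append(x, 3.9)      # returns a new vector, but the result is discarded
length(x)           # still 0

x <- append(x, 3.9) # assigning the result back is what actually grows x
length(x)           # now 1
```

Growing a vector one element at a time works, but preallocating it (as in the rep(NA, L) answer above) or using replicate() avoids the repeated copying.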

Resources