Repeating a for loop in R

Suppose I have a 10 x 10 matrix. I want to randomly choose 2 numbers from each column and take the square of the difference of these numbers. I wrote the R code for that and I get 10 values, but I wish to repeat this, say, 100 times, in which case I need to get 100*10=1000 numbers. How could I do that?
x <- rnorm(100)
m <- 10
n <- 10
X <- matrix(x, m, n)
q2 <- numeric(m)  # preallocate the result vector before filling it
for (i in 1:m) {
  y <- sample(X[, i], 2, replace = FALSE)
  q2[i] <- (y[1] - y[2])^2
}

Or, as @Davide Passaretti and @nrussell mentioned in the comments, you can use replicate:
f1 <- function(x, m) {
  q2 <- vector(mode = "numeric", length = m)
  for (i in 1:m) {
    # draw two distinct entries from column i
    y <- sample(x[, i], 2, replace = FALSE)
    q2[i] <- (y[1] - y[2])^2
  }
  q2
}
n <- 100  # number of repetitions
res <- replicate(n, f1(X, m))
prod(dim(res))
#[1] 1000
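The inner for loop can also be replaced by apply over the columns. The following is only a sketch of that idea, not part of the original answer; the helper name sq_diff and the seed are assumptions made for illustration.

```r
set.seed(42)                       # assumed seed, only for reproducibility
X <- matrix(rnorm(100), 10, 10)

# Hypothetical helper: squared difference of two values drawn from one column
sq_diff <- function(col) {
  y <- sample(col, 2, replace = FALSE)
  (y[1] - y[2])^2
}

# apply() visits each column once; replicate() repeats the whole pass 100 times
res <- replicate(100, apply(X, 2, sq_diff))
dim(res)        # one row per column of X, one column per repetition
length(res)     # total number of values
```

Each column of res holds the 10 values from one repetition, so as.vector(res) gives all of them as a single vector if that is more convenient.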

Related

While-Loop to generate a sample

Hello,
I'm having problems with the following while loop in R. I want to find the number of samples (n) at which I achieve a variance of less than 0.01 (dtest), and I want the loop to report the values of n, m, s and d:
n <- 100
x <- rnorm(n,0,1)
sd(x)
d <- sd(x)/sqrt(n)
dtest <- 0.01
while(dtest <=0.01) {
x <- rnorm(n,0,1)
n <- n+1
m <- mean(x)
s <- sd(x)
d <- s/sqrt(n)
return(output <- data.frame(n,m,s,d))
}
The first time I ran the loop it worked without problems and gave an n of approximately 27K. Now, every time I execute the loop, it just keeps accumulating.
There are a number of issues:
Your condition should compare d to dtest. Currently, it's comparing two values that aren't changed within the loop, so it will run forever.
Increment n at the start of the loop. Otherwise you're using a different n to compute x and d.
Create your results data frame once, after the loop, rather than creating and discarding it on every iteration. And don't use return(), which is meant for use inside functions.
Note that sd(x)/sqrt(n) is standard error, not variance. Variance would be sd(x)^2.
set.seed(13)
n <- 99
x <- rnorm(n,0,1)
d <- sd(x)/sqrt(n)
dtest <- 0.01
while (dtest <= d) {
  n <- n + 1
  x <- rnorm(n, 0, 1)
  s <- sd(x)
  d <- s / sqrt(n)
}
output <- data.frame(n, m = mean(x), s, d)
output
#      n          m         s           d
# 1 9700 0.01906923 0.9848469 0.009999605
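The same search can be wrapped in a function, so that n does not carry over between runs and the loop cannot run forever. This is a minimal sketch, not the answerer's code: the name find_n and the max_n guard are assumptions added for illustration.

```r
# Hypothetical wrapper around the search above.
# max_n is an assumed safety cap so the loop always terminates.
find_n <- function(dtest = 0.01, n_start = 100, max_n = 1e5) {
  n <- n_start
  repeat {
    x <- rnorm(n, 0, 1)
    s <- sd(x)
    d <- s / sqrt(n)              # standard error of the mean
    if (d < dtest || n >= max_n) break
    n <- n + 1
  }
  data.frame(n = n, m = mean(x), s = s, d = d)
}

set.seed(13)
out <- find_n()
out
```

Because every call starts again from n_start, repeated calls do not accumulate state the way re-running a top-level loop does. For standard normal draws s is close to 1, so the stopping point should land somewhat below (1 / 0.01)^2 = 10000.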

Part 2: How to make nested for-loop do every permutation

This is an update to the first question I asked. I essentially am still missing the pretty obvious learning lesson and can't expand the code that worked originally.
Trying to store a nested loop that runs all permutations of calculations.
I'm still missing an understanding of how to set up the index. I am now trying to set this up with three nested loops (and really, I want to understand how to do it with any number of them). The code below doesn't quite work: the third loop doesn't index over every combination (although I get 1000 calculations anyway). I commented the part that is clearly wrong.
With only two loops, this was sufficient, per the answer of my first post:
index <- 10*(i-1) + j
So not sure why the way I changed it for 3 loops doesn't work, but it's obviously wrong.
iter = 10 #length of parameter ranges to test
perm = 3 #how many parameters are being varied
n_c <- 1
m_c <- 1
n_n <- 1
m_n <- 1
n_v <- 1
my_data_c <- vector("numeric", iter^perm)
my_data_n <- vector("numeric", iter^perm)
my_data_v <- vector("numeric", iter^perm)
rho_c_store <- vector("numeric", iter)
rho_n_store <- vector("numeric", iter)
rho_v_store <- vector("numeric", iter)
for (i in 1:iter) {
# you can move this assignment to the outer loop
rho_c <- (i / 10)
x <- (rho_c * n_c)/m_c
for (j in 1:iter) {
rho_n <- (j / 10)
y <- (rho_n * n_n)/m_n
for (k in 1:iter){
rho_v <- (k / 10)
z <- rho_v/n_v
index <- iter*(i-2)+j+ k #Clearly where the error is
rho_c_store[index] <- rho_c
rho_n_store[index] <- rho_n
rho_v_store[index] <- rho_v
my_data_c[index] <- x
my_data_n[index] <- y
my_data_v[index] <- z
}
}
}
my_data <- cbind(rho_c_store, rho_n_store, rho_v_store, my_data_c, my_data_n,my_data_v)
print(my_data)
Think of it as if you were counting: the ones move the fastest, next come the tens, then the hundreds: 001, 002, ..., 010, ..., 099, 100, 101. You can think of your loop variables i, j and k as the digits of a number: k moves the fastest, j slower and i slower still. To get the right index you have to multiply i by 100, j by 10 and k by one, giving you the index 100 * (i-1) + 10 * (j-1) + k. The -1 for i and j is necessary because the loop variables start from 1, but we want to start by adding 0 * 100 and 0 * 10.
So all you need to do is change your index calculation to index <- iter^2*(i-1)+ iter*(j-1) + k.
# I stripped the example to the essentials for easier understanding
iter = 10 #length of parameter ranges to test
perm = 3 #how many parameters are being varied
my_data_c <- vector("numeric", iter^perm)
my_data_n <- vector("numeric", iter^perm)
my_data_v <- vector("numeric", iter^perm)
for (i in 1:iter) {
  x <- i / 10
  for (j in 1:iter) {
    y <- j / 10
    for (k in 1:iter) {
      z <- k / 10
      index <- iter^2 * (i - 1) + iter * (j - 1) + k
      my_data_c[index] <- x
      my_data_n[index] <- y
      my_data_v[index] <- z
    }
  }
}
my_data <- cbind(my_data_c, my_data_n, my_data_v)
print(my_data)
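The digit analogy extends to any number of nested loops: each counter is weighted by a power of iter. The sketch below assumes every loop has the same length iter; the helper name flat_index is hypothetical and not from the original answer.

```r
# Hypothetical helper: map loop counters (outermost first) to one flat index.
# Assumes every loop runs over 1:iter.
flat_index <- function(counters, iter) {
  p <- length(counters)
  weights <- iter^((p - 1):0)     # e.g. 100, 10, 1 for three loops of length 10
  sum((counters - 1) * weights) + 1
}

iter <- 10
flat_index(c(1, 1, 1), iter)      # first combination
flat_index(c(10, 10, 10), iter)   # last combination

# Cross-check the three-loop formula against expand.grid,
# whose first column varies fastest (k here, like the innermost loop)
grid <- expand.grid(k = 1:iter, j = 1:iter, i = 1:iter)
index <- iter^2 * (grid$i - 1) + iter * (grid$j - 1) + grid$k
all(index == seq_len(iter^3))     # every slot is hit exactly once, in order
```

The cross-check also shows why expand.grid can stand in for the nested loops: its rows enumerate the combinations in the same order as the flat index.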
Anyway, while it's great to develop a deep understanding of these things, you're certainly on the right track when you move away from looping with explicit indices and use R's toolkit for such tasks.
Hope this helps
In case anyone else wants to try this: expand.grid is definitely the way to go. I'm still not sure how to deal with the nested loop from a coding perspective, but oh well.
n_c <- 1
m_c <- 1
n_n <- 1
m_n <- 1
n_v <- 1
c = c(1,2)
n = c(10,20)
v = c(100,200)
parm_length = 3
iter = length(c)
test <- data.frame(matrix(ncol=parm_length, nrow=iter^parm_length))
test[] <- expand.grid(c,n,v)
x <- (test[,1] * n_c)/m_c
y <- (test[,2]* n_n)/m_n
z <- (test[,3]/n_v)
group <- cbind(test,x,y,z)
print(group)

How can I fill a matrix with a repeat loop and delete columns by condition in R?

My R-Code:
l <- list()
for(i in 1:5){
n <- 1
mat <- matrix(0L,500,10)
repeat{
a <- rnorm(10)
b <- rnorm(10)
c <- a+b
mat[n,] <- c
mat <- mat[mat[,10] >= 0 + (i/10) & mat[,1] >= 0 +(i/10),]
n <- n +1
if(mat[500,] != 0){
break
}
}
l[[i]] <- mat
}
l
I would like to get 5 Matrices, which are stored in a list. Each matrix should have exactly 500 rows and should not have negative values in its rows at position [,1] or [,10].
I tried to build a repeat loop that does the following:
1. Calculate a vector
2. Store the vector in the matrix
3. Delete it if the condition is met
4. Repeat if there aren't 500 rows
Unfortunately, there's something wrong and it doesn't work. What can I do? Thanks!
If you add an if-clause that tests your condition before adding the line to your matrix, it should work:
l <- list()
for (i in 1:5) {
  n <- 1
  mat <- matrix(0L, 500, 10)
  repeat {
    a <- rnorm(10)
    b <- rnorm(10)
    c <- a + b
    # only store the row if positions 1 and 10 pass the threshold
    if (!any(c[c(1, 10)] < 0 + i / 10)) {
      mat[n, ] <- c
      n <- n + 1
    }
    if (n == 501) {
      break
    }
  }
  l[[i]] <- mat
}
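An alternative is to generate candidate rows in batches and keep only those that pass the condition, which avoids testing one row at a time. This is a sketch under assumptions, not the answerer's method: the function name fill_matrix and the batch size are made up for illustration.

```r
# Hypothetical helper: collect n_rows rows whose first and last entries
# are both >= threshold, generating candidates batch by batch.
fill_matrix <- function(threshold, n_rows = 500, n_cols = 10, batch = 1000) {
  kept <- matrix(numeric(0), 0, n_cols)
  while (nrow(kept) < n_rows) {
    # each entry is a + b with a, b standard normal, as in the loop version
    cand <- matrix(rnorm(batch * n_cols) + rnorm(batch * n_cols), batch, n_cols)
    ok <- cand[, 1] >= threshold & cand[, n_cols] >= threshold
    kept <- rbind(kept, cand[ok, , drop = FALSE])
  }
  kept[seq_len(n_rows), , drop = FALSE]   # trim any surplus rows
}

set.seed(1)                               # assumed seed for reproducibility
l <- lapply(1:5, function(i) fill_matrix(i / 10))
sapply(l, dim)                            # rows and columns of each matrix
```

The drop = FALSE arguments keep the result a matrix even if a batch happens to yield a single accepted row.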

Is it possible to use vector math in R for a summation involving intervals?

Title's a little rough, open to suggestions to improve.
I'm trying to calculate time-average covariances for a 500 length vector.
This is the equation we're using
The result I'm hoping for is a vector with an entry for k from 0 to 500 (0 would just be the variance of the whole set).
I've started with something like this, but I know I'll need to reference the gap (i) in the first mean comparison as well:
x <- rnorm(500)
xMean <-mean(x)
i <- seq(1, 500)
dfGam <- data.frame(i)
dfGam$gamma <- (1/(500-dfGam$i))*(sum((x-xMean)*(x[-dfGam$i]-xMean)))
Is it possible to do this using vector math or will I need to use some sort of for loop?
Here's the for loop that I've come up with for the solution:
gamma_func <- function(input_vec) {
output_vec <- c()
input_mean <- mean(input_vec)
iter <- seq(1, length(input_vec)-1)
for(val in iter){
iter2 <- seq((val+1), length(input_vec))
gamma_sum <- 0
for(val2 in iter2){
gamma_sum <- gamma_sum + (input_vec[val2]-input_mean)*(input_vec[val2-val]-input_mean)
}
output_vec[val] <- (1/length(iter2))*gamma_sum
}
return(output_vec)
}
Thanks
Using data.table, mostly for the shift function to make x_{t - k}, you can do this:
library(data.table)
gammabar <- function(k, x){
xbar <- mean(x)
n <- length(x)
df <- data.table(xt = x, xtk = shift(x, k))[!is.na(xtk)]
df[, sum((xt - xbar)*(xtk - xbar))/n]
}
gammabar(k = 10, x)
# [1] -0.1553118
The filter [!is.na(xtk)] starts the sum at t = k + 1, because xtk will be NA for the first k indices due to being shifted by k.
Reproducible x
x <- c(0.376972124936433, 0.301548373935665, -1.0980231706536, -1.13040590360378,
-2.79653431987176, 0.720573498411587, 0.93912102300901, -0.229377746707471,
1.75913134696347, 0.117366786802848, -0.853122822287008, 0.909259181618213,
1.19637295955276, -0.371583903741348, -0.123260233287436, 1.80004311672545,
1.70399587729432, -3.03876460529759, -2.28897494991878, 0.0583034949929225,
2.17436525195634, 1.09818265352131, 0.318220322390854, -0.0731475581637693,
0.834268741278827, 0.198750636733429, 1.29784138432631, 0.936718306241348,
-0.147433193833294, 0.110431994640128, -0.812504663900505, -0.743702167768748,
1.09534507180741, 2.43537370755095, 0.38811846676708, 0.290627670295127,
-0.285598287083935, 0.0760147178373681, -0.560298603759627, 0.447188372143361,
0.908501134499943, -0.505059597708343, -0.301004012157305, -0.726035976548133,
-1.18007702699501, 0.253074712637114, -0.370711296884049, 0.0221795637601637,
0.660044122429767, 0.48879363533552)
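For all lags at once in base R, the inner loop of the asker's gamma_func can be replaced by a vectorised sum. This is only a sketch: the name gamma_vec is hypothetical, and it follows the asker's loop in dividing by the number of terms (n - k), whereas gammabar above divides by n.

```r
# Hypothetical helper: lagged covariance for k = 1, ..., n - 1.
# Divides by (n - k), matching the asker's loop (an assumption about the
# intended formula, since the equation itself is not shown).
gamma_vec <- function(x) {
  n <- length(x)
  xc <- x - mean(x)                       # centre once, reuse for every lag
  sapply(seq_len(n - 1), function(k) {
    sum(xc[(k + 1):n] * xc[1:(n - k)]) / (n - k)
  })
}

set.seed(1)                               # assumed seed, example data only
x <- rnorm(50)
g <- gamma_vec(x)
length(g)                                 # one entry per lag
```

Multiplying entry k by (n - k) / n converts it to the divide-by-n convention used in gammabar.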

Basic loop index in a function

I want to add a loop to the code below so that it runs four times for
n <- c(1000, 10000, 100000, 1000000)
It should return a matrix that contains each value of n and the corresponding estimate of pi. Thanks!
Here is my code for a single value of n:
n <- 1000
x <- c(runif(n, -1,1))
y <-c(runif(n, -1,1))
points <- data.frame(cbind(x,y))
z <- points$x^2 + points$y^2
pi <- function(n,points){
y <- 4*length(z[z<=1])/n
return(y)
}
pi(n, points)
Here is a way to do it with an implicit loop (sapply) instead of a for loop:
calc_pi <- function(n){
x <- c(runif(n, -1,1))
y <-c(runif(n, -1,1))
points <- data.frame(cbind(x,y))
z <- points$x^2 + points$y^2
pi <- function(n,points){
y <- 4*length(z[z<=1])/n
return(y)
}
pi(n, points)
}
n <- c(1000, 10000, 100000, 1000000)
set.seed(1)
data.frame(n = n, pi = sapply(n, calc_pi))
n pi
1 1e+03 3.080000
2 1e+04 3.141600
3 1e+05 3.137640
4 1e+06 3.143064
Note that it is good practice to set a random seed with set.seed when working with random numbers, so that the results are reproducible.
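The function can be shortened further, and renaming it avoids masking R's built-in constant pi. The following is a sketch of the same Monte Carlo estimate, not the answerer's code; the names estimate_pi and pi_hat are assumptions.

```r
# Hypothetical helper: fraction of uniform points in [-1, 1]^2 that fall
# inside the unit circle, scaled by 4 (the area of the square).
estimate_pi <- function(n) {
  x <- runif(n, -1, 1)
  y <- runif(n, -1, 1)
  4 * mean(x^2 + y^2 <= 1)    # mean of a logical vector is the proportion TRUE
}

set.seed(1)
n <- c(1000, 10000, 100000, 1000000)
result <- data.frame(n = n, pi_hat = sapply(n, estimate_pi))
result
```

Because the helper draws its own points and takes only n as input, it does not depend on any variable in the global environment.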
