How can I repeat these two lines of code 100+ times? - r

I'm still new to the programming world and looking for some guidance on a model I am building for individual animal growths over time.
The goal for the code I'm working with is to
i) Generate random starting sizes of animals from a given distribution
ii) Give each of these individuals a starting growth rate from a given distribution
iii) Calculate new size of individual after 1 year
iv) Assign a new growth rate from above distribution
v) Calculate the new size of individual after another year.
So far I have the code below. What I want is to repeat the last two lines of code x times without having to run them manually over and over.
# Generate starting lengths
lengths <- seq(from=4.4, to=5.4, by =0.1)
# Generate starting ks (growth rate)
ks <- seq(from=0.0358, to=0.0437, by =0.0001)
#Create individuals
create.inds <- function(id = NaN, length0=NaN, k1=NaN){
inds <- data.frame(id=id, length0 = length0, k1=k1)
inds
}
# Generate individuals
n.initial <- 100 # number of individuals
inds <- create.inds(id=1:n.initial,
length0=sample(lengths, n.initial, replace=TRUE),
k1=sample(ks, n.initial, replace=TRUE))
# Calculate new lengths based on last and 2nd last columns and insert into next column
inds[,ncol(inds)+1] <- 326*(1-exp(-(inds[,ncol(inds)])))+
(inds[,ncol(inds)-1]*exp(-(inds[,ncol(inds)])))
# Calculate new ks and insert into last column
inds[,ncol(inds)+1] <- sample(ks, 100, replace=TRUE)
Any and all assistance would be appreciated, also if you think there is a better way to write this please let me know.

I think what you are asking for is a simple loop:
for (i in 1:100) { # replace 100 with the number of times you want this to execute
inds[,ncol(inds)+1] <- 326*(1-exp(-(inds[,ncol(inds)])))+
(inds[,ncol(inds)-1]*exp(-(inds[,ncol(inds)])))
# Calculate new ks and insert into last column
inds[,ncol(inds)+1] <- sample(ks, 100, replace=TRUE)
}
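An alternative to appending columns by position is to keep lengths and growth rates in two pre-allocated matrices, one column per year. The sketch below assumes the same distributions and growth formula as the question; the names n_inds, n_years, L_inf, len and k are illustrative and do not come from the original post.

```r
set.seed(1)                      # for reproducibility
n_inds  <- 100                   # number of individuals (assumed)
n_years <- 10                    # number of yearly steps (assumed)
L_inf   <- 326                   # constant taken from the question's formula

lengths <- seq(from = 4.4, to = 5.4, by = 0.1)
ks      <- seq(from = 0.0358, to = 0.0437, by = 0.0001)

# One row per individual; len has one extra column for the starting size
len <- matrix(NA_real_, nrow = n_inds, ncol = n_years + 1)
k   <- matrix(sample(ks, n_inds * n_years, replace = TRUE),
              nrow = n_inds, ncol = n_years)
len[, 1] <- sample(lengths, n_inds, replace = TRUE)

for (yr in seq_len(n_years)) {
  # Same update as in the question, using this year's growth rate
  len[, yr + 1] <- L_inf * (1 - exp(-k[, yr])) + len[, yr] * exp(-k[, yr])
}
```

With this layout, len[, 1] holds the starting sizes and len[, yr + 1] the sizes after yr years, so nothing depends on the current number of columns.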

Related

How to iterate a given process 1'000 times and average the results

I am here to ask you about R language and how to construct a loop to iterate some functions several times.
Here is my problem: I have a numeric matrix obtained from previous analyses (matrix1) that I want to compare (using the overlap function that results in a single value) with another numeric matrix that I get by extracting values of a given raster with a set of randomly created points, as many as the values in the first numeric matrix.
I want to repeat the random sampling procedure 1'000 times, in order to get 1'000 different sets of random points, then repeat the comparison with matrix1 1'000 times (one for each set of random points), and, in the end, calculate the mean of the results to get a single value.
Hereafter I give you an example of the functions I want to use:
#matrix1 is the first matrix, obtained before starting the potential loop;
#LineVector is a polyline shapefile that has to be used within the loop and downloaded before it;
#Raster is a raster from which I should extract values at points location;
#The loop should start from here:
Random_points <- st_sample(LineVector, size = 2000, exact = TRUE, type = "random")
Random_points <- Random_points[!st_is_empty(Random_points)]
Random_points_vect <- vect(Random_points)
Random_values <- terra::extract(Raster, Random_points_vect, ID = F, raw = T)
Random_values <- na.omit(Random_values[, c("Capriolo")])
Values_list <- list(matrix1, Random_values)
Overlapping_value <- overlap(Values_list, type = "2")
#This value, obtained 1'000 times, has then to be averaged into a single number.
I hope I have posed my question in a clear and understandable manner, and I hope you can help me with this problem.
Thanks to everyone in advance, I wish you a good day!
The easiest way I can think of is to use "replicate":
values <- replicate(1000, {
Random_points <- st_sample(LineVector, size = 2000, exact = TRUE, type = "random")
Random_points <- Random_points[!st_is_empty(Random_points)]
Random_points_vect <- vect(Random_points)
Random_values <- terra::extract(Raster, Random_points_vect, ID = F, raw = T)
Random_values <- na.omit(Random_values[, c("Capriolo")])
Values_list <- list(matrix1, Random_values)
Overlapping_value <- overlap(Values_list, type = "2")
Overlapping_value
})
mean(values)
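Note that replicate() only simplifies its result to a plain numeric vector when each evaluation returns a single number. The toy sketch below shows the pattern without the spatial packages; one_run() is a hypothetical stand-in for one round of sampling and comparison. If overlap() returns a list rather than a bare number, the numeric component would need to be extracted inside the braces before averaging.

```r
set.seed(42)

# Hypothetical stand-in for one round of sampling + comparison;
# it must return a single number
one_run <- function() {
  x <- rnorm(2000)
  mean(x > 0)
}

# replicate() evaluates the expression 1000 times and collects the results
values <- replicate(1000, one_run())

# vapply() does the same but fails loudly if a run does not return one number
values_checked <- vapply(seq_len(1000), function(i) one_run(), numeric(1))

mean(values)
```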

how to create a for loop in R to run a simple random sample and calculate the average of each set

data = read.csv(file= "~/Downloads/data.csv")
temp=(data$temp)
n=75
N=length(temp)
s=sample(1:N, n)
ybar=mean(temp[s])
I want to run the sample 100 times where n is 75. Then calculate average of each sample, and subtract each average from a set number (50).
Maybe a short loop is the way to go. Notice the bracket on the left-side of the equals sign in the last line of code - it's the key to using a loop for your calculation!
# set a seed - always a good idea when using randomness like 'sample()'
set.seed(123)
n_reps = 100 # number of samples to draw
# pre-allocate an "empty" vector to fill in with results
ybar_vec = numeric(n_reps)
# do your calculation "n_reps" times
for(i in 1:n_reps) {
s = sample(1:N, n) # draw n = 75 row indices without replacement
ybar_vec[i] = 50 - mean(temp[s]) # store i^th calc as i^th element of ybar_vec
}
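The same calculation can also be written without an explicit loop. This is a minimal sketch that uses simulated values in place of data$temp (the CSV file is not available here); the variable diffs is an illustrative name.

```r
set.seed(123)
temp <- rnorm(500, mean = 48, sd = 5)  # simulated stand-in for data$temp
n <- 75                                # size of each sample

# 100 sample means, each from n values drawn without replacement,
# subtracted from the fixed number 50
diffs <- 50 - replicate(100, mean(sample(temp, n)))
```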

MCMC in R Modify Proposal

I've been working with MCMC for population genetics and I have some doubts.
I'm not experienced in statistics and because of that I have difficulty.
I have code to run MCMC for 1000 iterations. I start by creating a matrix of 0's (50 columns = 50 individuals and 1000 rows for 1000 iterations).
Then I create a random vector to replace the first row of the matrix. This vector has 1's and 2's, representing population 1 or population 2.
I also have genotype frequencies and the genotypes of the 50 individuals.
What I want is to, according to the genotype frequencies and genotypes, determine to what population an individual belongs.
Then, I'll keep changing the population assigned to a random individual and checking if the new value should be accepted.
niter <- 1000
z <- matrix(0,nrow=niter,ncol=ncol(targetinds))
z[1,] <- sample(1:2, size=ncol(z), replace=T)
lhood <- numeric(niter)
lhood[1] <- compute_lhood_K2(targetinds, z[1,], freqPops)
accepted <- 0
priorz <- c(1e-6, 0.999999)
for(i in 2:niter) {
z[i,] <- z[i-1,]
# propose new vector z, by selecting a random individual, proposing a new zi value
selind <- sample(1:nind, size=1)
# proposal probability of selecting individual at random
proposal_ratio_ind <- log(1/nind)-log(1/nind)
# propose a new index for the selected individual
if(z[i,selind]==1) {
z[i,selind] <- 2
} else {
z[i,selind] <- 1
}
# proposal probability of changing the index of individual is 1/2
proposal_ratio_cluster <- log(1/2)-log(1/2)
propratio <- proposal_ratio_ind+proposal_ratio_cluster
# compute f(x_i|z_i*, p)
# the probability of the selected individual given the two clusters
probindcluster <- compute_lhood_ind_K2(targetinds[,selind],freqPops)
# likelihood ratio f(x_i|z_i*,p)/f(x_i|z_i, p)
lhoodratio <- probindcluster[z[i,selind]]-probindcluster[z[i-1,selind]]
# prior ratio pi(z_i*)/pi(z_i)
priorratio <- log(priorz[z[i,selind]])-log(priorz[z[i-1,selind]])
# accept new value according to the MH ratio
mh <- lhoodratio+propratio+priorratio
# reject if the random value is larger than the MH ratio
if(runif(1)>exp(mh)) {
z[i,] <- z[i-1,] # keep the same z
lhood[i] <- lhood[i-1] # keep the same likelihood
} else { # if accepted
lhood[i] <- lhood[i-1]+lhoodratio # update the likelihood
accepted <- accepted+1 # increase the number of accepted
}
}
I have been asked to change the proposal probability so that the newly proposed values are proportional to the likelihood. This supposedly leads to a Gibbs sampling MCMC algorithm.
I don't know what to change in the code to do this. I also don't fully understand the concept of proposal probability or how to choose the prior.
I would be grateful if someone could clarify these points.
Your current proposal is done here:
# propose a new index for the selected individual
if(z[i,selind]==1) {
z[i,selind] <- 2
} else {
z[i,selind] <- 1
}
If the individual is assigned to cluster 1, then you propose to switch the assignment deterministically by assigning them to cluster 2 (and vice versa).
You didn't show us what freqPops is, but if you want to propose according to freqPops then I believe the above code has to be replaced by
z[i,selind] <- sample(c(1,2),size=1,prob=freqPops)
(at least that is what I understand when you say you want to propose based on the likelihood; however, that statement of yours is unclear).
For this to be a valid MCMC Gibbs sampling algorithm, you also need to change the next line of code:
proposal_ratio_cluster <- log(freqPops[z[i-1,selind]])-log(freqPops[z[i,selind]])
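If "proportional to the likelihood" means drawing the new label from its full conditional distribution, that is, with probability proportional to prior times likelihood for each cluster, then every draw is accepted and the step becomes a Gibbs update. The sketch below assumes a vector of per-cluster log-likelihoods for the selected individual (which is how probindcluster appears to be used in the question). The function gibbs_draw_label, its arguments, and the numbers are hypothetical and not part of the original code.

```r
# Draw a cluster label from its full conditional:
# P(z = k | x) is proportional to prior[k] * exp(loglik[k])
gibbs_draw_label <- function(loglik, prior) {
  logpost <- loglik + log(prior)
  w <- exp(logpost - max(logpost))          # subtract the max for numerical stability
  sample(seq_along(w), size = 1, prob = w)  # sample() rescales the weights itself
}

set.seed(1)
loglik <- c(-12.3, -10.1)      # made-up log-likelihoods under clusters 1 and 2
prior  <- c(0.5, 0.5)          # flat prior (assumed)
draws  <- replicate(5000, gibbs_draw_label(loglik, prior))
table(draws) / length(draws)   # empirical frequency of each label
```

Inside the loop, a draw like this would take the place of the proposal for the selected individual, and the accept/reject test would no longer be needed for that step.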

Simulation in R, for loop

I am trying to run a simulation 10 times in R, but I have not figured out how to do it. The code is shown below; you can run it in R straight away. When I run it, it gives me 5 values of "w" as output. I think this is only one simulation, but what I actually want is 10 different simulations of those 5 values.
I know I will need to write a for loop for this, but I could not work it out. Could anyone help, please?
# simulate 10 times
# try N = 10, for loop?
# initial values w0 and E
W0=1000
E= 1000
data = c(-0.02343731, 0.045509474 ,0.076144158,0.09234636,0.0398257)
constant = exp(cumsum(data))
exp.cum = cumsum(1/constant)
w=constant*(W0 - exp.cum)- E
w
You'll want to generate new values of data in each simulation. Do this within the curly brackets that follow the for loop. Then, before closing the curly brackets, be sure to save your output in the appropriate position in an object, such as a vector. For a simple example,
W0=1000
E= 1000
n_per_sim <- 5
num_sims <- 10
set.seed(12345) #seed is necessary for reproducibility
sim_output_1 <- rep(NA, times = num_sims) #This creates a vector of 10 NA values
for (sim_number in 1:num_sims){ #this starts your for loop
data <- rnorm(n=n_per_sim, mean=10, sd=2) #generate your data
average <- mean(data)
sim_output_1[sim_number] <- average #this is where you store your output for each simulation
}
sim_output_1 #Now you can see the average from each simulation
Note that if you want to save five values from each simulation, you can use a matrix object instead of a vector object, as shown here
matrix_output <- matrix(NA, ncol=n_per_sim, nrow=num_sims) #This creates a 10x5 matrix
for (sim_number in 1:num_sims){ #this starts your for loop
data <- rnorm(n=n_per_sim, mean=10, sd=2) #generate your data
constant = exp(cumsum(data))
exp.cum = cumsum(1/constant)
w=constant*(W0 - exp.cum)- E
matrix_output[sim_number, ] <- w #this is where you store your output for each simulation
}
matrix_output #Now you can see the five w values from each simulation

How to Generate Normal Random Samples within Mean±3Sigma

I want to draw normal random numbers in an array of order ((100*8)*5000) with a specific Mean (M) and Standard Deviation (S) but I want them to be only within the range M±3S, so that I don't have any outliers in my array exceeding those limits.
Any Suggestion? I want to write a program in R based on this array for some simulation studies. I am using following R Code to generate my Data Set:
for(i in 1:5000){
for(j in 1:8){
Dat[,j,i]=rnorm(100,mean=muu[j],sd=sigma[j])
}
}
Now we want to get rid of the values that fall outside muu±3sigma in the above data. Of course, we have to replace the discarded values with fresh ones so that the dimensions of the Dat array stay intact.
First Solution
Here is a start but I bet there is a more elegant solution.
First generate a sample, then subset it to the desired range. Of course you have to adjust the values to your needs.
set.seed(123)
rs <- rnorm(10000, mean = 10, sd = 3)
rs1 <- rs[ rs >= 1 & rs <= 19 ] # keep values within mean ± 3 sd, i.e. 10 ± 9
Second (better) solution
I think my first solution didn't work so well. I have just written some code that might suit your purposes better. Here are the steps:
create an array of NAs with the required dimensions
fill it with random numbers
create a logical array that is TRUE where a value falls outside the bounds
use that logical array to replace the out-of-range values with the mean used to generate the samples
data <- array(NA, dim = c(100, 8, 5000))
for(i in 1:5000){
data[ , , i] <- rnorm(800, 3, 1)
}
bound <- 3 + c(-1, 1)*3*1
pr <- data <= bound[1] | data >= bound[2]
data[pr] <- 3
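Replacing out-of-range values with the mean keeps the dimensions but piles values up at one point. If fresh random values are wanted instead, as the question asks, one option is to redraw only the rejected values until none remain. The sketch below is one way to do that; rnorm_within_3sd is a hypothetical helper, the muu and sigma values are made up, and the third dimension is reduced from 5000 to 50 to keep the example fast.

```r
# Draw n normal values, redrawing any that fall outside mean +/- 3 * sd
rnorm_within_3sd <- function(n, mean, sd) {
  x <- rnorm(n, mean, sd)
  out <- abs(x - mean) > 3 * sd
  while (any(out)) {
    x[out] <- rnorm(sum(out), mean, sd)   # fresh draws only for rejected values
    out <- abs(x - mean) > 3 * sd
  }
  x
}

set.seed(123)
muu   <- c(1, 2, 3, 4, 5, 6, 7, 8)   # example means (assumed)
sigma <- rep(c(0.5, 2), times = 4)   # example standard deviations (assumed)
n_rep <- 50                          # 5000 in the question; smaller here for speed

Dat <- array(NA_real_, dim = c(100, 8, n_rep))
for (i in seq_len(n_rep)) {
  for (j in 1:8) {
    Dat[, j, i] <- rnorm_within_3sd(100, muu[j], sigma[j])
  }
}
```

The result follows a truncated normal distribution, so its standard deviation will be slightly smaller than the sigma passed in.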

Resources