How to run a simulation using a for loop - r

I am trying to run a simulation in a for loop. The call that runs the simulation is:
sim.predation(size = 30, n = 100, time = 100, handling.time = 2, draw.plot=FALSE)
I want to loop over the n argument only, from 100 to 1000.

sapply(100:1000, function(x){
res <- sim.predation(size = 30, n=x, time = 100, handling.time = 2, draw.plot=FALSE)
return(res)
})
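Or, with a for loop, storing each result so that it is not discarded:
n.values <- 100:1000
res <- vector("list", length(n.values))
for (i in seq_along(n.values)) {
res[[i]] <- sim.predation(size = 30, n = n.values[i], time = 100, handling.time = 2, draw.plot=FALSE)
}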
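Because sim.predation() is not defined anywhere in this thread, the sketch below uses a hypothetical stand-in with the same argument names, purely so that the pattern can be run end to end. It also assumes that each run returns a single number; if the real function returns something richer, lapply() with a list would be the safer choice.

```r
# Hypothetical stand-in for sim.predation(); it is NOT the real model,
# only a placeholder with the same arguments that returns one number.
sim.predation <- function(size, n, time, handling.time, draw.plot = FALSE) {
  n / (1 + handling.time * n / time)
}

n_values <- 100:1000

# vapply() checks that every run returns exactly one numeric value
captured <- vapply(n_values, function(x) {
  sim.predation(size = 30, n = x, time = 100, handling.time = 2, draw.plot = FALSE)
}, numeric(1))

# Keep each value of n next to its result
results <- data.frame(n = n_values, captured = captured)
head(results, 3)
```

Keeping n and the result in one data frame makes it straightforward to plot or summarise the runs afterwards.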


Process sets of rasters in parallel using lapp function from terra package

I have groups of rasters that I want to run a function on, probably using the lapp function from the {terra} package. Here is a simple example with toy data showing the kind of thing I am hoping to accomplish.
library("terra")
rp10val = 106520
rp20val = 106520
rp50val = 154250
rp100val = 154250
rp200val = 154250
rp500val = 154250
rp1500val = 154250
sopval = 200
rp_10_vul = rast(nrow = 10, ncol = 10, vals = rep(rp10val, 10))
rp_20_vul = rast(nrow = 10, ncol = 10, vals = rep(rp20val, 10))
rp_50_vul = rast(nrow = 10, ncol = 10, vals = rep(rp50val, 10))
rp_100_vul = rast(nrow = 10, ncol = 10, vals = rep(rp100val, 10))
rp_200_vul = rast(nrow = 10, ncol = 10, vals = rep(rp200val, 10))
rp_500_vul = rast(nrow = 10, ncol = 10, vals = rep(rp500val, 10))
rp_1500_vul = rast(nrow = 10, ncol = 10, vals = rep(rp1500val, 10))
sop_tile = rast(nrow = 10, ncol = 10, vals = rep(sopval, 10))
input_raster_group <- c(rp_10_vul, rp_20_vul, rp_50_vul, rp_100_vul,
rp_200_vul, rp_500_vul, rp_1500_vul, sop_tile)
## In the real world, each of these lists would hold rasters with different data
input_raster_lists <- list(list(input_raster_group),
list(input_raster_group),
list(input_raster_group))
mcmapply(lapp,
input_raster_lists,
function(a,b,c,d,e,f,g,h){a+b+c+d+e+f+g+h},
mc.cores = 2)
## If working on windows, this might be better to try and run as proof of concept
# mapply(lapp,
# input_raster_lists,
# function(a,b,c,d,e,f,g,h){(a+b-c) / (d+e+f+g+h)})
Simplified data to make this easier to read
library("terra")
r10 = rast(nrow = 10, ncol = 10, vals = 10)
r20 = rast(nrow = 10, ncol = 10, vals = 20)
r50 = rast(nrow = 10, ncol = 10, vals = 50)
group <- c(r10, r20, r50)
input <- list(group, group, group)
You can use lapply to process the list sequentially:
x <- lapply(input, \(i) sum(i))
y <- lapply(input, \(i) app(i, sum))
z <- lapply(input, \(i) lapp(i, function(a,b,c){a+b+c}))
To use parallelization you could use e.g. parallel::parLapply or, as in your case, parallel::mcmapply.
SpatRaster objects hold a pointer (reference) to a C++ object that cannot be passed to a worker. Therefore you would need to use wrap and unwrap as I show below. I use proxy=TRUE to not force values to memory.
library(parallel)
inp <- lapply(input, \(x) wrap(x, proxy=TRUE))
f <- \(i) { unwrap(i) |> sum() |> wrap(proxy=TRUE)}
b <- mcmapply(f, inp)
out <- lapply(b, unwrap)
This approach may be useful in some cases, e.g. when you have to do many simulations on a relatively small raster that is in memory.
In most cases you would do parallelization because you are dealing with large rasters that are on disk. In that case, you could just send the filenames to the workers, and create the SpatRasters there (and write the output to disk).
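The filename-based approach could look roughly like the sketch below. The temporary input files, the helper name process_file and the "_sum.tif" output naming are all assumptions made for illustration; mclapply() only forks on Unix-alikes, so the sketch falls back to one core on Windows.

```r
library(terra)
library(parallel)

# Hypothetical inputs: three small 3-layer rasters written to temporary files
files <- vapply(1:3, function(i) {
  f <- tempfile(fileext = ".tif")
  r <- rast(nrows = 10, ncols = 10, nlyrs = 3, vals = i)
  writeRaster(r, f)
  f
}, character(1))

# Each worker receives only a filename, builds the SpatRaster itself,
# writes the result to disk and returns the output filename.
process_file <- function(infile) {
  outfile <- sub("\\.tif$", "_sum.tif", infile)
  r <- rast(infile)
  app(r, sum, filename = outfile, overwrite = TRUE)
  outfile
}

n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
outfiles <- unlist(mclapply(files, process_file, mc.cores = n_cores))

# Back in the main session, open the results from disk
out <- lapply(outfiles, rast)
```

Only character strings cross the process boundary here, so no wrap/unwrap step is needed.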

Fitting data frame probability distributions with different lengths - EnvStats - looping in R

I'm trying to fit probability distributions in R using the EnvStats package, looping over several columns at once.
The columns have different lengths, and the code fails; the data do not seem to stay in numeric format.
Error message: 'x' must be a numeric vector
I couldn't identify the cause. Could anyone help?
Many thanks
The code follows:
x = runif(n = 50, min = 1, max = 12)
y = runif(n = 70, min = 5, max = 15)
z = runif(n = 35, min = 1, max = 10)
m = runif(n = 80, min = 6, max = 18)
length(x) = length(m)
length(y) = length(m)
length(z) = length(m)
df = data.frame(x=x,y=y,z=z,m=m)
df
library(EnvStats)
nproc = 4
cont = 1
dfr = data.frame(variavel = character(nproc),
locationevd= (nproc), scaleevd= (nproc),
stringsAsFactors = F)
# i = 2
for (i in 1:4) {
print(i)
nome.var=colnames(df)
df = df[,c(i)]
df = na.omit(df)
variavela = nome.var[i]
dfr$variavel[cont] = variavela
evd = eevd(df);evd
locationevd = evd$parameters[[1]]
dfr$locationevd[cont] = locationevd
scaleevd = evd$parameters[[2]]
dfr$scaleevd[cont] = scaleevd
cont = cont + 1
}
writexl::write_xlsx(dfr, path = "Results.xls")
Two major changes to your code:
First, use a list instead of a dataframe (so you can accommodate unequal vector lengths):
x = runif(n = 50, min = 1, max = 12)
y = runif(n = 70, min = 5, max = 15)
z = runif(n = 35, min = 1, max = 10)
m = runif(n = 80, min = 6, max = 18)
vl = list(x=x,y=y,z=z,m=m)
vl
if (!require(EnvStats)) { install.packages('EnvStats'); library(EnvStats) }
nproc = 4
# cont = 1 Not used
dfr = data.frame(variavel = character(nproc),
locationevd = numeric(nproc), scaleevd = numeric(nproc),
stringsAsFactors = FALSE)
Second, use a single loop index and drop the separate "cont" counter:
nome.var = names(vl) # look up the names once, before the loop
for (i in seq_along(vl)) {
evd = eevd(vl[[i]]) # fit the extreme value (Gumbel) distribution
dfr$variavel[i] = nome.var[i]
dfr$locationevd[i] = evd$parameters[[1]] # location estimate
dfr$scaleevd[i] = evd$parameters[[2]] # scale estimate
}
Which gets you the desired structure (the estimates will differ from run to run because no seed is set):
dfr
  variavel locationevd scaleevd
1        x    5.469831 2.861025
2        y    7.931819 2.506236
3        z    3.519528 2.040744
4        m   10.591660 3.223352
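One plausible reason for the original error message, offered as an assumption rather than something confirmed from the EnvStats source: if eevd() validates its input with is.vector(), then the result of na.omit() on a padded column would be rejected, because na.omit() attaches an "na.action" attribute. Separately, `df = df[,c(i)]` overwrites the data frame with a single column, so later iterations have no columns left to index. The base R sketch below (toy values only) shows the attribute issue.

```r
x <- c(2.5, 7.1, NA, 4.8)   # toy column padded with NA

cleaned <- na.omit(x)
names(attributes(cleaned))  # includes "na.action"
is.vector(cleaned)          # FALSE: attributes other than names are present

# Dropping NAs by subsetting keeps a plain numeric vector
plain <- x[!is.na(x)]
is.vector(plain)            # TRUE

# as.vector() also strips the extra attribute
is.vector(as.vector(cleaned))  # TRUE
```

Storing the vectors in a list, as in the answer above, avoids the NA padding altogether, so neither workaround is needed there.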

Set Acceptable Region for My Skewness Test in R

I am writing the function below to run a skewness test for several sample sizes (10, 20, 50, 100), with 1000 replicates each.
library(moments)
out <- t(sapply(c(10, 20, 50, 100), function(x)
table(replicate(1000, skewness(rgamma(n = x, shape = 3, rate = 0.5))) > 2)))
row.names(out) <- c(10, 20, 50, 100)
out
My conditions
I reject the null hypothesis if the statistic meets either of two conditions:
it is less than -2,
or it is greater than +2.
What I have
But my R function only checks the second condition.
What I want
How do I include both the first and the second condition in my function?
Perhaps wrapping the statistic in abs() would be the easiest way to cover both conditions, since abs(statistic) > 2 holds exactly when the statistic is below -2 or above +2:
out <- t(sapply(c(10, 20, 50, 100), function(x)
table(abs(unlist(replicate(1000, skewness(rgamma(n = x, shape = 3, rate = 0.5))))) > 2)))
row.names(out) <- c(10, 20, 50, 100)
out
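One caveat with tabulating a logical vector: if no replicate (or every replicate) exceeds the threshold, table() returns a single cell and the rows no longer line up. A minimal sketch of one way around this fixes the levels with factor(). To keep it self-contained it uses a hypothetical helper, skew(), that computes the moment-based sample skewness in base R; substitute moments::skewness if the package is available.

```r
set.seed(1)  # assumption: a fixed seed, only to make the run repeatable

# Hypothetical helper: moment-based sample skewness, m3 / m2^(3/2)
skew <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

sizes <- c(10, 20, 50, 100)

out <- t(sapply(sizes, function(n) {
  s <- replicate(1000, skew(rgamma(n = n, shape = 3, rate = 0.5)))
  # Fixed levels guarantee both a FALSE and a TRUE column for every n
  table(factor(abs(s) > 2, levels = c(FALSE, TRUE)))
}))
rownames(out) <- sizes
out
```

Each row then always has two counts that sum to the number of replicates.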

Replicate function doesn't work with "on the fly" function

I have the following data.frame:
df_1 <- data.frame(
x = replicate(
n = 6, expr = runif(n = 30, min = 20, max = 100), simplify = TRUE
)
)
I want to generate 50 data.frames with this function:
f_1 <- function(x) {
data.frame(x = replicate(n = 5, runif(n = 30, min = 20, max = 100)))
}
lt_1 <- replicate(n = 50, expr = f_1(), simplify = FALSE)
The result is OK. But when I define the function inline (on the fly) inside replicate, it doesn't work:
lt_2 <- replicate(
n = 50, expr = function(x) {
data.frame(x = replicate(n = 5, runif(n = 30, min = 20, max = 100)))
}, simplify = FALSE
)
What is the problem?
We can wrap the anonymous function in parentheses and add () after it so that it is actually called:
lt_2 <- replicate(
n = 50, expr = (function(x) {
data.frame(x = replicate(n = 5, runif(n = 30, min = 20, max = 100)))
})(), simplify = FALSE
)
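In the OP's lt_1, the function is called with f_1(). In lt_2, expr is only a function definition that is never called, so replicate returns 50 function objects instead of 50 data frames.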
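Since replicate() re-evaluates expr on every iteration, another option is to skip the function wrapper and pass the data.frame() call itself. The sketch below shows this, along with a small check of what an uncalled function definition produces; the name fns is just for illustration.

```r
# expr is evaluated 50 times, so each element is a fresh data frame
lt_2 <- replicate(
  n = 50,
  expr = data.frame(x = replicate(n = 5, runif(n = 30, min = 20, max = 100))),
  simplify = FALSE
)
dim(lt_2[[1]])       # 30 rows, 5 columns

# An uncalled function definition just yields function objects
fns <- replicate(n = 3, expr = function(x) runif(1), simplify = FALSE)
is.function(fns[[1]])
```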

Sort a vector where the largest is at the center in r

I know this is a simple question, but I have searched everywhere and I am pretty sure that there is no answer to my question.
I want to sort a vector so that the largest value is in the middle and the values decrease towards both tails.
For example:
c( 20, 30, 40, 50, 60)
I want to have:
c(20, 40, 60, 50, 30 ) or c(30, 50, 60, 40, 20 )
It does not matter.
Can anyone offer me a quick solution?
Thanks!
This is much easier to solve if you assume that you have 2n (n is a natural number) distinct observations. Here is one solution:
ints = sample.int(100, size = 30, replace = FALSE)
ints_o = ints[order(ints)]
ints_tent = c(ints_o[seq.int(from = 1, to = (length(ints) - 1), by = 2)],
rev(ints_o[seq.int(from = 2, to = length(ints), by = 2)]))
Edit:
Here is a function that deals with both cases:
makeTent = function(ints) {
ints_o = ints[order(ints)]
if((length(ints) %% 2) == 0) {
# even number of observations
ints_tent = c(ints_o[seq.int(from = 1, to = (length(ints) - 1), by = 2)],
rev(ints_o[seq.int(from = 2, to = length(ints), by = 2)]))
} else {
# odd number of observations
ints_tent = c(ints_o[seq.int(from = 2, to = (length(ints) - 1), by = 2)],
rev(ints_o[seq.int(from = 1, to = length(ints), by = 2)]))
}
return(ints_tent)
}
# test the function
ints_even = sample.int(100, size = 30, replace = FALSE)
ints_odd = sample.int(100, size = 31, replace = FALSE)
makeTent(ints_odd)
makeTent(ints_even)
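If it is acceptable for the peak to sit one position off-centre when the length is even, a shorter variant seems to cover both cases without branching: take the odd positions of the sorted vector in ascending order, then the even positions in reverse. This is a sketch under that assumption, and the name tent_sort is made up for illustration.

```r
tent_sort <- function(x) {
  s <- sort(x)
  odd <- seq_along(s) %% 2 == 1
  # ascending odd positions, then descending even positions
  c(s[odd], rev(s[!odd]))
}

tent_sort(c(20, 30, 40, 50, 60))
# 20 40 60 50 30
```

For the example in the question this reproduces the first of the two acceptable orderings.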
