Increment inside a for loop in R

I am trying to run 1000 simulations to see how many of my f values fall in the rejection region, which is above 1.48 or below .67.
I have this but the variables don't increment as they should:
for (k in 1:1000){
Adata = rnorm(100, mean = 30, sd = 10)
Bdata = rnorm(100, mean = 45, sd = 10)
f = (sd(Bdata)^2)/(sd(Adata)^2)
if (f > 1.48){
a = 0
a <- a + 1}
if (f < .67){
b = 0
b <- b + 1}
}
a
[1] 1
b
[1] 1
The end goal is to find the sum of a and b.
I have also tried:
for (k in 1000){
Adata = rnorm(100, mean = 30, sd = 10)
Bdata = rnorm(100, mean = 45, sd = 10)
f = (sd(Bdata)^2)/(sd(Adata)^2)
a = f > 1.48
b = f < .67
}
y = sum(a)+sum(b)
y
[1] 0
How else could I increment to get the total number of f values that are in the rejection region?

In your first example, you are resetting a and b to zero every time the if statement is true. Therefore, the maximum value will always be 1.
To fix, rearrange those lines:
a = 0 #initialize outside of the loop
b = 0 #initialize outside of the loop
set.seed(1) # added for SO as you are using rnorm, remove this when you run your simulations
for (k in 1:1000){
Adata = rnorm(100, mean = 30, sd = 10)
Bdata = rnorm(100, mean = 45, sd = 10)
f = (sd(Bdata)^2)/(sd(Adata)^2)
if (f > 1.48){
a <- a + 1}
if (f < .67){
b <- b + 1}
}
I now get a = 13 and b = 29
That said, don't increment variables like this in R. You can take advantage of matrices and vectorized operations.
First, create the simulation matrices:
set.seed(1)
Adata = matrix(data = rnorm(100*1000, mean = 30, sd = 10), nrow = 1000, ncol = 100)
Bdata = matrix(data = rnorm(100*1000, mean = 45, sd = 10), nrow = 1000, ncol = 100)
Then calculate your f score for each row (each row is one simulation):
f <- apply(Bdata,1,function(x){sd(x)^2})/apply(Adata,1,function(x){sd(x)^2})
Now you can simply use:
sum(f > 1.48)
[1] 15
and:
sum(f < .67)
[1] 25
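If you prefer to keep the loop, a middle ground is to store each f in a preallocated vector and count afterwards with the same vectorized comparisons. This is only a minimal sketch; the names n_sim, f_values, n_upper, n_lower and n_reject are illustrative and not from the question.

```r
set.seed(1)                    # only for reproducibility, remove for real runs
n_sim <- 1000                  # illustrative name for the number of simulations
f_values <- numeric(n_sim)     # preallocate one slot per simulation

for (k in seq_len(n_sim)) {
  Adata <- rnorm(100, mean = 30, sd = 10)
  Bdata <- rnorm(100, mean = 45, sd = 10)
  f_values[k] <- var(Bdata) / var(Adata)   # var(x) is the same as sd(x)^2
}

# Count after the loop instead of incrementing inside it
n_upper <- sum(f_values > 1.48)
n_lower <- sum(f_values < 0.67)
n_reject <- n_upper + n_lower
n_reject
```

Keeping all f values also makes it easy to inspect them later, for example with hist(f_values).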

In the first block of code you are resetting a and b to 0 every iteration, then possibly adding 1 (so the most they will ever be is 1 because next iteration they will be set to 0 again).
In the second block you are setting a and b to either TRUE or FALSE, but you are overwriting the value, so you only see the value from the final iteration (actually that loop only runs once with k equal to 1000, but if you had 1:1000 there then you would only see the last iteration).
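Both effects can be seen in isolation with a small sketch (the variable names n_single, n_range, last_only and kept are illustrative only):

```r
# for (k in 1000) iterates over a length-one vector, so the body runs once
n_single <- 0
for (k in 1000) n_single <- n_single + 1

# for (k in 1:1000) iterates over 1000 values
n_range <- 0
for (k in 1:1000) n_range <- n_range + 1

c(n_single = n_single, n_range = n_range)

# Plain assignment inside a loop keeps only the value from the last iteration
last_only <- NULL
for (k in 1:3) last_only <- k > 1

# Assigning by index keeps one value per iteration
kept <- logical(3)
for (k in 1:3) kept[k] <- k > 1

length(last_only)
sum(kept)
```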
The simple solution is to move the a=0 and b=0 (or better a <- 0 and b <- 0) outside of the loop.
The better approach is to use something in the apply family of functions.
I would suggest something like:
out <- replicate(1000, {
Adata = rnorm(100, mean = 30, sd = 10)
Bdata = rnorm(100, mean = 45, sd = 10)
(sd(Bdata)^2)/(sd(Adata)^2)
})
sum( out > 1.48 )
sum( out < 0.67 )
sum( out > 1.48 | out < 0.67 )
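As a side note, the cutoffs 1.48 and .67 look like two-sided 5% critical values of an F distribution with 99 and 99 degrees of freedom. That is an assumption on my part, since the question does not say where they come from. Under that assumption, a sketch like the following computes the cutoffs with qf() and reports the rejection rate as a proportion; the names alpha, lower, upper and reject_rate are illustrative.

```r
alpha <- 0.05          # assumed significance level, not stated in the question
df_num <- 100 - 1      # degrees of freedom for Bdata (n - 1)
df_den <- 100 - 1      # degrees of freedom for Adata (n - 1)

lower <- qf(alpha / 2, df_num, df_den)
upper <- qf(1 - alpha / 2, df_num, df_den)
round(c(lower = lower, upper = upper), 2)

set.seed(1)            # only for reproducibility
out <- replicate(1000, {
  Adata <- rnorm(100, mean = 30, sd = 10)
  Bdata <- rnorm(100, mean = 45, sd = 10)
  var(Bdata) / var(Adata)
})

# Proportion of simulated f values in the rejection region
reject_rate <- mean(out > upper | out < lower)
reject_rate
```

Because both samples have the same standard deviation, the rejection rate should land somewhere near alpha.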

Related

How can I create a new vector command within a loop in R?

for (i in 1:100) {
e <- rnorm(n = 20, mean = 100, sd = 10)
}
e
So I want to know how I can run a command on each of the 100 randomly generated vectors (each of length 20). E.g. how can I get a new result for each new random vector?
for (i in 1:100) {
e <- rnorm(n = 20, mean = 100, sd = 10)
new_vector <- mean(e) - median(e)
}
e
I have tried this but that's definitely not it.
With the OP's code, we need to initialize e and then append to it with c() in each iteration:
e <- numeric(0)
for (i in 1:100) {
e <- c(e, rnorm(n = 20, mean = 100, sd = 10))
}
If we want to create a list
e <- vector('list', 100)
for(i in 1:100) {
e[[i]] <- rnorm(n = 20, mean = 100, sd = 10)
}
Or, if the interest is to get a vector with the difference of mean and median, initialize new_vector with length 100, loop over the sequence (1:100), draw the random numbers into 'e', and assign the difference of mean and median to each position of 'new_vector', using the sequence as the index:
new_vector <- numeric(100)
for(i in 1:100){
e <- rnorm(n = 20, mean = 100, sd = 10)
new_vector[i] <- mean(e) - median(e)
}
Or using lapply/sapply/replicate
lapply(replicate(100, rnorm(n = 20, mean = 100, sd = 10),
simplify = FALSE), function(x) mean(x) - median(x))
Or with vectorized functions - colMedians (from matrixStats) and colMeans
library(matrixStats)
m1 <- replicate(100, rnorm(n = 20, mean = 100, sd = 10))
colMeans(m1) - colMedians(m1)
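To check that the loop and the matrix approach give the same numbers, here is a base R sketch without matrixStats. The names loop_result and matrix_result are illustrative, and apply(m1, 2, median) stands in for colMedians.

```r
# Loop version: one result per iteration
set.seed(42)
loop_result <- numeric(100)
for (i in 1:100) {
  e <- rnorm(n = 20, mean = 100, sd = 10)
  loop_result[i] <- mean(e) - median(e)
}

# Matrix version: replicate() returns a 20 x 100 matrix, one column per draw
set.seed(42)
m1 <- replicate(100, rnorm(n = 20, mean = 100, sd = 10))
matrix_result <- colMeans(m1) - apply(m1, 2, median)

# Same seed and same order of draws, so the two should agree
all.equal(loop_result, matrix_result)
```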

Fitting probability distributions to data frame columns with different lengths - EnvStats - looping in R

I'm trying to fit probability distributions in R using the EnvStats package, looping to process multiple columns at once.
The columns have different lengths and the code throws an error. The data does not seem to remain in numeric format.
Error message: 'x' must be a numeric vector
I couldn't identify the error. Could anyone help?
Many thanks
The code follows:
x = runif(n = 50, min = 1, max = 12)
y = runif(n = 70, min = 5, max = 15)
z = runif(n = 35, min = 1, max = 10)
m = runif(n = 80, min = 6, max = 18)
length(x) = length(m)
length(y) = length(m)
length(z) = length(m)
df = data.frame(x=x,y=y,z=z,m=m)
df
library(EnvStats)
nproc = 4
cont = 1
dfr = data.frame(variavel = character(nproc),
locationevd= (nproc), scaleevd= (nproc),
stringsAsFactors = F)
# i = 2
for (i in 1:4) {
print(i)
nome.var=colnames(df)
df = df[,c(i)]
df = na.omit(df)
variavela = nome.var[i]
dfr$variavel[cont] = variavela
evd = eevd(df);evd
locationevd = evd$parameters[[1]]
dfr$locationevd[cont] = locationevd
scaleevd = evd$parameters[[2]]
dfr$scaleevd[cont] = scaleevd
cont = cont + 1
}
writexl::write_xlsx(dfr, path = "Results.xls")
Two major changes to your code:
First, use a list instead of a dataframe (so you can accommodate unequal vector lengths):
x = runif(n = 50, min = 1, max = 12)
y = runif(n = 70, min = 5, max = 15)
z = runif(n = 35, min = 1, max = 10)
m = runif(n = 80, min = 6, max = 18)
vl = list(x=x,y=y,z=z,m=m)
vl
if (!require(EnvStats)) { install.packages('EnvStats'); library(EnvStats) }
nproc = 4
# cont = 1 Not used
dfr = data.frame(variavel = character(nproc),
locationevd = numeric(nproc), scaleevd = numeric(nproc),
stringsAsFactors = F)
Second, use a single loop index and drop the separate "cont" index:
for ( i in 1:length(vl) ) {
# print(i) Not needed
nome.var=names(vl) # probably should have been done before loop
var = vl[[i]]
variavela = nome.var[i]
dfr$variavel[i] = variavela # all those could have been one step
evd = eevd( vl[[i]] ) # ;evd
locationevd = evd$parameters[[1]]
dfr$locationevd[i] = locationevd
scaleevd = evd$parameters[[2]]
dfr$scaleevd[i] = scaleevd
}
Which gets you the desired structure:
dfr
variavel locationevd scaleevd
1 x 5.469831 2.861025
2 y 7.931819 2.506236
3 z 3.519528 2.040744
4 m 10.591660 3.223352
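One plausible reason for the original error message is that na.omit() returns a vector carrying an extra "na.action" attribute, and is.vector() returns FALSE for such an object. Whether eevd validates its input this way is an assumption on my part, not something confirmed here. Note also that the original loop overwrote df on the first pass, so later iterations no longer had a data frame to index. The sketch below uses base R only; cleaned_omit and cleaned_index are illustrative names.

```r
x <- runif(n = 50, min = 1, max = 12)
length(x) <- 80                  # pads with NA, as in the question

cleaned_omit <- na.omit(x)       # values kept, but an "na.action" attribute is added
cleaned_index <- x[!is.na(x)]    # plain numeric vector without extra attributes

names(attributes(cleaned_omit))
is.vector(cleaned_omit, mode = "numeric")
is.vector(cleaned_index, mode = "numeric")
```

Subsetting with !is.na() is therefore a safe way to drop the padding before passing a column to a function that insists on a plain numeric vector.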

How to execute pairwise.t.test on a list with a `for` loop?

My list (lt):
df_1 <- data.frame(
x = replicate(
n = 2,
expr = runif(n = 30, min = 20, max = 100)
),
y = sample(
x = 1:3, size = 30, replace = TRUE
)
)
lt <- split(
x = df_1,
f = df_1[['y']]
)
vars <- names(df_1)[1:2]
I tried:
for (i in vars) {
for (i in i) {
print(pairwise.t.test(x = lt[, i], g = lt[['y']], p.adj = 'bonferroni'))
}
}
But, the error message is:
Error in lista[, i] : incorrect number of dimensions
What's the problem?
We don't need to split
pairwise.t.test(unlist(df_1[1:2]), g = rep(df_1$y, 2), p.adj = 'bonferroni')
#Pairwise comparisons using t tests with pooled SD
#data: unlist(df_1[1:2]) and rep(df_1$y, 2)
# 1 2
#2 1.00 -
#3 0.91 1.00
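If the goal was instead one test per variable (my reading of the nested loop in the question, so treat it as an assumption), a sketch along these lines runs pairwise.t.test on each column of the unsplit data frame. The set.seed call and the name results are illustrative additions.

```r
set.seed(123)   # only for reproducibility
df_1 <- data.frame(
  x = replicate(n = 2, expr = runif(n = 30, min = 20, max = 100)),
  y = sample(x = 1:3, size = 30, replace = TRUE)
)
vars <- names(df_1)[1:2]   # the two measurement columns, "x.1" and "x.2"

# One pairwise comparison per variable, grouped by y
results <- lapply(vars, function(v) {
  pairwise.t.test(x = df_1[[v]], g = factor(df_1$y),
                  p.adjust.method = "bonferroni")
})
names(results) <- vars
results
```

The original error arises because lt is a list of data frames, so it has no second dimension to index with lt[, i].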

Step change in input parameter with time in R

Can anyone help me incorporate a step change in an input parameter with respect to time? Please see the code below:
library(ReacTran)
N <- 10 # No of grids
L = 0.10 # thickness, m
l = L/2 # Half of thickness, m
k= 0.412 # thermal conductivity, W/m-K
cp = 3530 # specific heat capacity, J/kg-K
rho = 1100 # density, kg/m3
T_int = 57.2 # Initial temperature , degC
T_air = 19 # air temperature, degC
h_air = 20 # Convective heat transfer coeff of air, W/m2-K
xgrid <- setup.grid.1D(x.up = 0, x.down = l, N = N)
x <- xgrid$x.mid
alpha.coeff <- (k*3600)/(rho*cp)
Diffusion <- function (t, Y, parms){
tran <- tran.1D(C=Y, flux.down = 0, C.up = T_air, a.bl.up = h_air,
D = alpha.coeff, dx = xgrid)
list(dY = tran$dC, flux.up = tran$flux.up,
flux.down = tran$flux.down)
}
# Initial condition
Yini <- rep(T_int, N)
times <- seq(from = 0, to = 2, by = 0.2)
print(system.time(
out <- ode.1D(y = Yini, times = times, func = Diffusion,
parms = NULL, dimens = N)))
plot(times, out[,(N+1)], type = "l", lwd = 2, xlab = "time, hr", ylab = "Temperature")
I want T_air to be constant for the first hour and then change to another value for the remaining hour. This would be a step change in the parameter. How can I do it?
Any help would be appreciated.
Thanks,
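One possible approach, offered only as a sketch, is to make the air temperature a function of time and evaluate it inside the model function, which already receives t as its first argument. The second temperature (30 here), the switch time variable t_switch and the function name T_air_step are hypothetical and not taken from the question. The ReacTran call is shown as a comment so that the block runs with base R alone.

```r
T_air_before <- 19   # degC, value from the question
T_air_after <- 30    # degC, hypothetical value for the second hour
t_switch <- 1        # hr, time of the step change

# Step function: returns the air temperature that applies at time t
T_air_step <- function(t) {
  if (t < t_switch) T_air_before else T_air_after
}

# Inside Diffusion(t, Y, parms), the upstream boundary would then use it:
#   tran <- tran.1D(C = Y, flux.down = 0, C.up = T_air_step(t), a.bl.up = h_air,
#                   D = alpha.coeff, dx = xgrid)

# Check the values seen on the output time grid from the question
times <- seq(from = 0, to = 2, by = 0.2)
data.frame(time = times, T_air = sapply(times, T_air_step))
```

Adaptive solvers can step across a discontinuity like this, so it may be worth comparing against a run that integrates 0 to 1 hr, then restarts from that state with the new temperature for 1 to 2 hr.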

Generating random numbers in two vectors in R given a specified condition

I want to create two vectors in R that contain values randomly drawn from a uniform distribution given a specified condition, that is for example if the number in vector A is < 50 then the number in vector B should be greater than 50.
I use this code, but the condition is applied only to the first element of the vectors:
nrows = 20
A = NaN*matrix(1, nrows, 1)
B = NaN*matrix(1, nrows, 1)
repeat {
A[] = round(runif(nrows, 10, 100), digits =2)
B[] = round(runif(nrows, 10, 100), digits =2)
if(A > 50 & B > 50) {
break
}
}
This should work for you, if I understood the problem correctly:
nrows = 20
A = NaN * matrix(1, nrows, 1)
B = NaN * matrix(1, nrows, 1)
for (i in 1:nrows) {
A[i] <- round(runif(1, 10, 100), digits = 2)
if (A[i] < 50) {
B[i] <- round(runif(1, 50, 100), digits = 2)
} else {
B[i] <- round(runif(1, 10, 100), digits = 2)
}
}
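The same rule can also be written without a loop, because runif() accepts a vector for its min argument. This is a sketch under the same reading of the problem as above; B_lower is an illustrative name and set.seed is only there for reproducibility.

```r
set.seed(1)
nrows <- 20

# Draw A first, then choose the lower bound of B element by element
A <- round(runif(nrows, min = 10, max = 100), digits = 2)
B_lower <- ifelse(A < 50, 50, 10)   # B must come from [50, 100] when A < 50
B <- round(runif(nrows, min = B_lower, max = 100), digits = 2)

cbind(A, B)
```

The original repeat loop only looked at the first element because if() expects a single TRUE or FALSE; recent R versions raise an error for a longer condition instead of silently using the first element.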
