Simplifying R code when fitting data (for loops)

I have working code to accomplish my goal, but as you will see it is not very elegant. I've tried writing it with for loops but my coding knowledge is relatively basic. Could some generous person help me simplify my code and hopefully annotate it so I can learn how to do it myself? I know this is a big ask but I appreciate any help, thanks!
The end goal is to produce a table, to export, that contains A0 and kobs values for each fit (along with their respective Std. errors).
The kobs values are then passed to another equation to obtain three additional parameters. In the provided data the [100] data set was excluded from the final analysis (the exclusion is justified by external factors). I am new here so if I am formatting something wrong or violating some standards please let me know and I will fix it.
Data (copied from a .csv file):
Time,[0]1,[0]2,[0]3,[1],[2.5],[6],[16],[40],[100]
0,1.008,,0.963,1.038,0.979,0.973,0.973,0.906,0.979
0,0.992,1.000,1.037,0.962,1.021,1.027,1.027,1.094,1.021
5,0.813,0.968,0.961,0.704,0.667,0.470,,,
5,0.861,0.971,0.913,0.713,0.645,0.512,0.353,0.306,0.351
10,0.820,0.868,0.888,0.613,0.407,0.262,,0.158,0.236
10,0.851,0.857,0.890,0.563,0.444,0.250,0.197,0.169,0.275
15,0.778,0.773,0.798,,,,0.154,0.145,0.204
15,0.778,0.752,0.894,0.552,0.308,0.184,0.109,0.146,0.238
20,0.610,0.727,0.806,0.441,0.247,0.180,0.114,0.143,0.269
20,0.747,0.784,0.806,0.426,0.257,0.176,0.138,0.116,0.345
30,,,,0.321,0.164,0.091,0.127,0.100,0.279
30,0.563,0.642,0.633,0.268,0.146,0.082,0.096,0.096,0.275
40,0.556,0.524,0.529,0.286,0.106,0.082,0.037,0.079,0.249
40,0.581,0.485,0.487,0.266,0.119,0.045,,,0.041
## Assign Time (x) data
t <- D1$Time
## Assign Response (y) data
R1 <- D1$`[0]1`
R2 <- D1$`[0]2`
R3 <- D1$`[0]3`
R4 <- D1$`[1]`
R5 <- D1$`[2.5]`
R6 <- D1$`[6]`
R7 <- D1$`[16]`
R8 <- D1$`[40]`
R9 <- D1$`[100]`
## Fit data
F1 <- nls(R1 ~ A1 * exp(-k1 * t), start = list(A1 = 1, k1 = 0.02))
P1 <- summary(F1)$parameters[,1:2]
F2 <- nls(R2 ~ A1 * exp(-k1 * t), start = list(A1 = 1, k1 = 0.02))
P2 <- summary(F2)$parameters[,1:2]
F3 <- nls(R3 ~ A1 * exp(-k1 * t), start = list(A1 = 1, k1 = 0.02))
P3 <- summary(F3)$parameters[,1:2]
F4 <- nls(R4 ~ A1 * exp(-k1 * t), start = list(A1 = 1, k1 = 0.02))
P4 <- summary(F4)$parameters[,1:2]
F5 <- nls(R5 ~ A1 * exp(-k1 * t), start = list(A1 = 1, k1 = 0.02))
P5 <- summary(F5)$parameters[,1:2]
F6 <- nls(R6 ~ A1 * exp(-k1 * t), start = list(A1 = 1, k1 = 0.02))
P6 <- summary(F6)$parameters[,1:2]
F7 <- nls(R7 ~ A1 * exp(-k1 * t), start = list(A1 = 1, k1 = 0.02))
P7 <- summary(F7)$parameters[,1:2]
F8 <- nls(R8 ~ A1 * exp(-k1 * t), start = list(A1 = 1, k1 = 0.02))
P8 <- summary(F8)$parameters[,1:2]
F9 <- nls(R9 ~ A1 * exp(-k1 * t), start = list(A1 = 1, k1 = 0.02))
P9 <- summary(F9)$parameters[,1:2]
## Assemble Table
SS <- c(colnames(D1)[2],colnames(D1)[3],colnames(D1)[4],colnames(D1)[5],colnames(D1)[6],colnames(D1)[7],colnames(D1)[8],colnames(D1)[9],colnames(D1)[10])
A0 <- c(P1[1,1],P2[1,1],P3[1,1],P4[1,1],P5[1,1],P6[1,1],P7[1,1],P8[1,1],P9[1,1])
SEA0 <- c(P1[1,2],P2[1,2],P3[1,2],P4[1,2],P5[1,2],P6[1,2],P7[1,2],P8[1,2],P9[1,2])
kobs <- c(P1[2,1],P2[2,1],P3[2,1],P4[2,1],P5[2,1],P6[2,1],P7[2,1],P8[2,1],P9[2,1])
SEkobs <- c(P1[2,2],P2[2,2],P3[2,2],P4[2,2],P5[2,2],P6[2,2],P7[2,2],P8[2,2],P9[2,2])
ExTab <- cbind(SS, A0, SEA0, kobs, SEkobs)
write_clip(ExTab)
conI <- c(0,0,0,0.5,1.5,4,12,35)
kobsA <- c(P1[2,1],P2[2,1],P3[2,1],P4[2,1],P5[2,1],P6[2,1],P7[2,1],P8[2,1])
kFit <- nls(kobsA ~ k0 + ((kin*conI)/(KI+conI)), start = list(k0 = 0.1, kin = 0.2, KI = 3))
summary(kFit)
My inclination is to use for loops to reduce the repetition, but I am unable to write working ones. I would also like to eliminate the ## Assign Response (y) data section so that I can apply the code to generic data sets with different concentrations in the column names, but using D1[2] in place of R1 inside the nls function produces an error.

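You can start like this:
t <- D1$Time
cols <- 2:ncol(D1)  # response columns; column 1 is Time
SS <- colnames(D1)[cols]
## Preallocate one slot per response column
A0 <- SEA0 <- kobs <- SEkobs <- numeric(length(cols))
for (j in seq_along(cols)) {
  y <- D1[[cols[j]]]  # extract the column so nls sees a plain vector
  fit <- nls(y ~ A1 * exp(-k1 * t), start = list(A1 = 1, k1 = 0.02))
  P <- summary(fit)$parameters[, 1:2]  # estimates and std. errors
  A0[j] <- P[1, 1]
  SEA0[j] <- P[1, 2]
  kobs[j] <- P[2, 1]
  SEkobs[j] <- P[2, 2]
}
ExTab <- cbind(SS, A0, SEA0, kobs, SEkobs)
write_clip(ExTab)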
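The same idea can also be written as a small helper applied to each response column. This is only a minimal sketch on simulated data: the data frame `demo`, the helper name `fit_decay`, and the decay rates are made up for illustration and are not taken from the question.

```r
# Simulated stand-in for D1: a Time column plus two response columns
set.seed(1)
demo <- data.frame(Time = rep(c(0, 5, 10, 15, 20, 30, 40), each = 2))
demo$`[0]` <- exp(-0.02 * demo$Time) + rnorm(nrow(demo), sd = 0.01)
demo$`[6]` <- exp(-0.15 * demo$Time) + rnorm(nrow(demo), sd = 0.01)
demo$`[6]`[3] <- NA  # a missing reading; nls drops it (default na.omit)

# Hypothetical helper: fit y = A1 * exp(-k1 * t) and return
# the estimates together with their standard errors
fit_decay <- function(y, t) {
  fit <- nls(y ~ A1 * exp(-k1 * t), start = list(A1 = 1, k1 = 0.02))
  P <- summary(fit)$coefficients[, 1:2]
  c(A0 = P["A1", "Estimate"], SEA0 = P["A1", "Std. Error"],
    kobs = P["k1", "Estimate"], SEkobs = P["k1", "Std. Error"])
}

# Every column except Time is treated as a response
resp <- setdiff(names(demo), "Time")
ExTab <- data.frame(SS = resp,
                    t(sapply(demo[resp], fit_decay, t = demo$Time)),
                    row.names = NULL)
print(ExTab)
```

Because the response columns are selected by name rather than by position, the sketch should carry over to data sets with different concentration labels, as long as the time column is called Time.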

Related

Simulate data with R package SimCorMultRes for correlated binary data

I am simulating data with R package SimCorMultRes. My code is below for cluster size 5 and 9. Now I want to do the same thing for cluster size 100. It's unrealistic to write down p1, p2, ..., p99. I am just wondering if anybody can help me.
library(SimCorMultRes)
cluster_size <- 5
x <- runif(1000*cluster_size)
p1 <- rep(c(1,0,0,0,0),1000)
p2 <- rep(c(0,1,0,0,0),1000)
p3 <- rep(c(0,0,1,0,0),1000)
p4 <- rep(c(0,0,0,1,0),1000)
beta_intercepts <- -3
beta_coefficients <- c(0,rnorm(4))
latent_correlation_matrix <- toeplitz(c(1, 0.2, 0.2, 0.2, 0.2))
# simulation of clustered binary responses
simulated_binary_dataset <- rbin(clsize = cluster_size, intercepts = beta_intercepts,
betas = beta_coefficients, xformula = ~x+p1+p2+p3+p4, cor.matrix = latent_correlation_matrix,
link = "logit")
library(SimCorMultRes)
cluster_size <- 9
x <- runif(1000*cluster_size)
p1 <- rep(c(1,rep(0,8)),1000)
p2 <- rep(c(0,1,rep(0,7)),1000)
p3 <- rep(c(rep(0,2),1,rep(0,6)),1000)
p4 <- rep(c(rep(0,3),1,rep(0,5)),1000)
p5 <- rep(c(rep(0,4),1,rep(0,4)),1000)
p6 <- rep(c(rep(0,5),1,rep(0,3)),1000)
p7 <- rep(c(rep(0,6),1,rep(0,2)),1000)
p8 <- rep(c(rep(0,7),1,rep(0,1)),1000)
beta_intercepts <- -3
beta_coefficients <- c(0,rnorm(8))
latent_correlation_matrix <- toeplitz(c(1, rep(0.2, 8)))
simulated_binary_dataset <- rbin(clsize = cluster_size, intercepts = beta_intercepts,
betas = beta_coefficients, xformula = ~x+p1+p2+p3+p4+p5+p6+p7+p8, cor.matrix = latent_correlation_matrix,
link = "logit")
I found a way to do it:
library(SimCorMultRes)
cluster_size <- 9
size=1000
x <- runif(size*cluster_size)
p1 <- rep(c(1,rep(0,cluster_size-1)),size)
for (i in 1:(cluster_size-2)) {
a <- paste0("p", i+1)
assign(a, rep(c(rep(0,i),1,rep(0,cluster_size-1-i)),size))
}
PredictorVariables <- paste("p", 1:(cluster_size-1), sep="")
Formula <- formula(paste("~ x + ",
paste(PredictorVariables, collapse=" + ")))
beta_intercepts <- -3
beta_coefficients <- c(0,rnorm(cluster_size-1))
latent_correlation_matrix <- toeplitz(c(1, rep(0.2, cluster_size-1)))
simulated_binary_dataset <- rbin(clsize = cluster_size, intercepts = beta_intercepts,
betas = beta_coefficients, xformula = Formula, cor.matrix = latent_correlation_matrix,
link = "logit")
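The indicator variables can also be built without a loop or assign(). The following is a hedged sketch using only base R; the matrix name `ind` is an assumption introduced here, and the call to rbin itself is left out.

```r
cluster_size <- 9
size <- 1000

# Row j of the identity matrix is the indicator pattern for position j
# within a cluster; repeating rows 1..cluster_size `size` times stacks
# the clusters. The last column is dropped to leave p1..p(cluster_size-1).
ind <- diag(cluster_size)[rep(seq_len(cluster_size), times = size), -cluster_size]
colnames(ind) <- paste0("p", seq_len(cluster_size - 1))

# One-sided formula ~ x + p1 + ... + p8
Formula <- reformulate(c("x", colnames(ind)))

# Expose the columns as variables p1, p2, ... in the current environment,
# which is where the code above expects to find them
list2env(as.data.frame(ind), envir = environment())
```

Changing cluster_size to 100 then yields p1 through p99 and the matching formula without further edits.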

Every time variable increases, create a new sample using a for loop

I need to do the following: I would like to increase the variable pi1 from -1 to 0 in steps of 0.0001 while all the other variables stay the same. As a second step I need to draw 1000 samples for each pi1 value. In the end, I need to measure the bias of both regressions relative to the real value. After a thorough investigation, I really don't see why the two loops are not working.
This is a sketch of how it could work:
index <- seq(1, 1000)
beta_OLS <- NULL
beta_IV <- NULL
for(i in seq(from = -1, to = 0, by = 0.001)) {
for(k in index) {
n <- 2000
pi1 <- i
b0 <- 0
b1 <- 0
b2 <- -1/1000
b3 <- 1/5
z <- runif(n, 0, 25)
ov <- rnorm(n, 0, 1)
d <- -1/2 + pi1 * z + 1/2 * ov + rnorm(n, 0, 1) > 0
y <- b0 + b1 * d + b2 * z + b3 * ov + rnorm(n, 0, 1/10)
#OLS Regression
model12 = lm(y ~ d, data = data)
beta_OLS[k] = model12$coefficients[2]
#IV Regression
model12_1 = ivreg(y ~ d | z, data=data)
beta_IV[k] = model12_1$coefficients[2]
}
}
real_value <- - 1/1000
average_OLS <- mean(beta_OLS)
average_IV <- mean(beta_IV)
biased_OLS <- average_OLS - real_value
biased_IV <- average_IV - real_value
biased_OLS
biased_IV
I made a few alterations and it ran, though I am not sure the results are right.
Since both loops had the same count, I eliminated one of them.
I also stored d and y from every run in a data frame called data (which you used as the source for your regressions).
index = seq(1,1000)
beta_OLS = NULL
beta_IV = NULL
i = -1
for(k in index){
n <- 2000
pi1 <- i
b0 <- 0
b1 <- 0
b2 <- -1/1000
b3 <- 1/5
z <- runif(n,0,25)
ov <- rnorm(n,0,1)
d <- -1/2 + pi1 * z + 1/2 * ov + rnorm(n,0,1) > 0
y <- b0 + b1 * d + b2 * z + b3 * ov + rnorm(n,0,1/10)
data = as.data.frame(cbind(y,d))
#OLS Regression
model12 = lm(y ~ d, data = data)
beta_OLS[k] = model12$coefficients[2]
#IV Regression
model12_1 = ivreg::ivreg(y ~ d | z, data=data)
beta_IV[k] = model12_1$coefficients[2]
pi1 <- i + 0.001  # no lasting effect: pi1 is reset to i at the top of each iteration
}
real_value = - 1/1000
average_OLS = mean(beta_OLS)
average_IV = mean(beta_IV)
biased_OLS = average_OLS - real_value
biased_IV = average_IV - real_value
biased_OLS
biased_IV
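For comparison, here is a hedged sketch that keeps both loops and stores one row of averages per pi1 value, so estimates from different pi1 values are not overwritten. It is not the code from the question: the grid is deliberately coarse and n_rep small to keep it fast, the helper `one_draw` is hypothetical, and the IV slope is computed with the closed form cov(z, y) / cov(z, d) for a single instrument instead of calling ivreg.

```r
set.seed(42)
pi1_grid <- seq(from = -1, to = 0, by = 0.25)  # coarse grid (assumption)
n_rep <- 50                                     # samples per pi1 value (assumption)
n <- 2000

# Hypothetical helper: draw one sample and return the OLS and IV slope on d
one_draw <- function(pi1) {
  z  <- runif(n, 0, 25)
  ov <- rnorm(n)
  d  <- as.numeric(-1/2 + pi1 * z + 1/2 * ov + rnorm(n) > 0)
  y  <- 0 + 0 * d - 1/1000 * z + 1/5 * ov + rnorm(n, 0, 1/10)
  c(OLS = unname(coef(lm(y ~ d))[2]),
    IV  = cov(z, y) / cov(z, d))  # single-instrument IV slope
}

# Outer loop over pi1, inner loop (replicate) over samples
results <- do.call(rbind, lapply(pi1_grid, function(p) {
  est <- replicate(n_rep, one_draw(p))  # 2 x n_rep matrix, rows OLS and IV
  data.frame(pi1 = p,
             mean_OLS = mean(est["OLS", ]),
             mean_IV  = mean(est["IV", ]))
}))
print(results)
```

Subtracting the true coefficient from mean_OLS and mean_IV then gives a bias estimate for each pi1. Near pi1 = 0 the instrument is almost unrelated to d, so the IV averages can be very unstable there.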

nls: Loop and break in a decided number of iterations

I'd like to perform two sequential operations:
1) Fit two nls models, one per subset; and
2) Refit each model in a loop until the number of iterations equals 1.
For the first step I do:
#Packages
library(minpack.lm)
# Data set - Diameter in function of Feature and Age
Feature<-sort(rep(c("A","B"),22))
Age<-c(60,72,88,96,27,
36,48,60,72,88,96,27,36,48,60,72,
88,96,27,36,48,60,27,27,36,48,60,
72,88,96,27,36,48,60,72,88,96,27,
36,48,60,72,88,96)
Diameter<-c(13.9,16.2,
19.1,19.3,4.7,6.7,9.6,11.2,13.1,15.3,
15.4,5.4,7,9.9,11.7,13.4,16.1,16.2,
5.9,8.3,12.3,14.5,2.3,5.2,6.2,8.6,9.3,
11.3,15.1,15.5,5,7,7.9,8.4,10.5,14,14,
4.1,4.9,6,6.7,7.7,8,8.2)
d<-dados <- data.frame(Feature,Age,Diameter)
str(d)
#Create an nls model (Levenberg-Marquardt algorithm) for each Feature (A and B)
e1<- Diameter ~ a1 * Age^a2
Fecture_vec<-unique(d$Feature)
mod_ND <- list() #List for save each model
for(i in 1:length(Fecture_vec)){
d2 <- subset(d, d$Feature == Fecture_vec[i])
mod_ND[[i]] <- nlsLM(e1, data = d2,
start = list(a1 = 0.1, a2 = 10),
control = nls.control(maxiter = 1000))
print(summary(mod_ND[[i]]))
}
#
So far so good. But the following fails when I try to loop over 999 refits, recycling the start values from coef(mod_ND[[i]])[1] and coef(mod_ND[[i]])[2] and stopping when the number of iterations is 1:
e1<- Diameter ~ a1 * Age^a2
Fecture_vec<-unique(d$Feature)
mod_ND <- list() #List for save each model
for(i in 1:length(Fecture_vec)){
d2 <- subset(d, d$Feature == Fecture_vec[i])
mod_ND[[i]] <- nlsLM(e1, data = d2,
start = list(a1 = 0.1, a2 = 10),
control = nls.control(maxiter = 1000))
Xs<-data.frame()
for(z in 1:999){
d2 <- subset(d, d$Feature == Fecture_vec[i])
mod_ND[[z]] <- nlsLM(e1, data = d2,
start = list(a1 = coef(mod_ND[[i]])[1], a2 = mod_ND[[i]])[2]),
control = nls.control(maxiter = 1000))
if (mod_ND[[z,c(finIter")]] <= 1){ break } ## Stop when iteractions =1
print(summary(mod_ND[[z]]))
}
}
#
This doesn't work. Any ideas, please?
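#Packages
library(minpack.lm)
library(plyr)  # dlply() and .() come from plyr, not dplyr
# Refit with the previous estimates as start values until
# nlsLM reports at most one iteration, then return the model
m <- function(d, a = 0.01, b = 10) {
  mod <- nlsLM(Diameter ~ a1 * Age^a2, start = list(a1 = a, a2 = b),
               control = nls.control(maxiter = 1000), data = d)
  par1 <- coef(mod)[["a1"]]
  par2 <- coef(mod)[["a2"]]
  print(summary(mod))
  if (mod$convInfo[["finIter"]] > 1) {
    m(d, par1, par2)  # recurse with the recycled start values
  } else {
    mod  # return the final fit so list_models holds the models
  }
}
list_models <- dlply(d, .(Feature), m)
list_models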
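The recursion can also be written as a bounded loop. The sketch below is an illustration under several assumptions: it swaps in base nls for nlsLM so that it runs without extra packages, uses simulated data rather than the data above, takes its first start values from a log-log linear fit, and the helper name `refit_until_stable` is made up.

```r
# Simulated data with the same column names as in the question
set.seed(7)
d <- data.frame(Feature = rep(c("A", "B"), each = 20),
                Age = rep(seq(24, 100, by = 4), times = 2))
d$Diameter <- ifelse(d$Feature == "A", 0.4, 0.3) * d$Age^0.8 +
  rnorm(nrow(d), sd = 0.3)

# Hypothetical helper: refit, recycling the estimates as start values,
# until the fit needs at most one iteration (or max_refits is reached)
refit_until_stable <- function(dat, max_refits = 999) {
  # Initial guesses from log(Diameter) = log(a1) + a2 * log(Age)
  ll <- coef(lm(log(Diameter) ~ log(Age), data = dat))
  start <- list(a1 = exp(ll[[1]]), a2 = ll[[2]])
  for (z in seq_len(max_refits)) {
    mod <- nls(Diameter ~ a1 * Age^a2, data = dat, start = start)
    if (mod$convInfo$finIter <= 1) break  # stop when iterations <= 1
    start <- as.list(coef(mod))           # recycle estimates
  }
  mod
}

# One model per Feature, returned as a named list
mods <- lapply(split(d, d$Feature), refit_until_stable)
print(lapply(mods, coef))
```

Bounding the number of refits avoids unbounded recursion if a fit never reaches the one-iteration condition.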

`nlme` with crossed random effects

I am trying to fit a non-linear model with crossed random effects, analogous to the linear random-effects models mentioned in this question and in this mailing list post, using the nlme package. However, I get an error regardless of what I try. Here is an example:
library(nlme)
#####
# simulate data
set.seed(18112003)
na <- 30
nb <- 30
sigma_a <- 1
sigma_b <- .5
sigma_res <- .33
n <- na*nb
a <- gl(na,1,n)
b <- gl(nb,na,n)
u <- gl(1,1,n)
x <- runif(n, -3, 3)
y_no_noise <- x + sin(2 * x)
y <-
x + sin(2 * x) +
rnorm(na, sd = sigma_a)[as.integer(a)] +
rnorm(nb, sd = sigma_b)[as.integer(b)] +
rnorm(n, sd = sigma_res)
#####
# works in the linear model where we know the true parameter
fit <- lme(
# somehow we found the right values
y ~ x + sin(2 * x),
random = list(u = pdBlocked(list(pdIdent(~ a - 1), pdIdent(~ b - 1)))))
vv <- VarCorr(fit)
vv2 <- vv[c("a1", "b1"), ]
storage.mode(vv2) <- "numeric"
print(vv2,digits=4)
#R Variance StdDev
#R a1 1.016 1.0082
#R b1 0.221 0.4701
#####
# now try to do the same with `nlme`
fit <- nlme(
y ~ c0 + sin(c1),
fixed = list(c0 ~ x, c1 ~ x - 1),
random = list(u = pdBlocked(list(pdIdent(~ a - 1), pdIdent(~ b - 1)))),
start = c(0, 0.5, 1))
#R Error in nlme.formula(y ~ a * x + sin(b * x), fixed = list(a ~ 1, b ~ :
#R 'random' must be a formula or list of formulae
The lme example is similar to the one on pages 163-166 of "Mixed-effects Models in S and S-PLUS", with only 2 random effects instead of 3.
I should have used a two-sided formula, as written in help("nlme"):
fit <- nlme(
y ~ c0 + c1 + sin(c2),
fixed = list(c0 ~ 1, c1 ~ x - 1, c2 ~ x - 1),
random = list(u = pdBlocked(list(pdIdent(c0 ~ a - 1), pdIdent(c1 ~ b - 1)))),
start = c(0, 0.5, 1))
# fixed effects estimates
fixef(fit)
#R c0.(Intercept) c1.x c2.x
#R -0.1788218 0.9956076 2.0022338
# covariance estimates
vv <- VarCorr(fit)
vv2 <- vv[c("c0.a1", "c1.b1"), ]
storage.mode(vv2) <- "numeric"
print(vv2,digits=4)
#R Variance StdDev
#R c0.a1 0.9884 0.9942
#R c1.b1 0.2197 0.4688

solving for steady state PDE using steady.1D (rootSolve R)

I am trying to obtain a steady state for a spatially-explicit Lotka-Volterra competition model of two competing species (with spatial diffusion). Here is the model (without diffusion term):
http://en.wikipedia.org/wiki/Competitive_Lotka%E2%80%93Volterra_equations
where I let r1 = r2 = rG and alpha12 = alpha21 = a. The carrying capacity of species 1 is assumed to vary linearly across space x, i.e. K1 = x (while K2 = 0.5). We assume Neumann boundary conditions. The spatial domain x runs from 0 to 1.
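Written out with the diffusion term, the system appears to be the following. This is inferred from the R code in the question rather than stated there, reading $D$ as `Da` and $r$ as `rG`:

```latex
\begin{aligned}
\frac{\partial S_1}{\partial t} &= D \frac{\partial^2 S_1}{\partial x^2}
  + r S_1 \left(1 - \frac{S_1 + a S_2}{K_1(x)}\right), \qquad K_1(x) = x, \\
\frac{\partial S_2}{\partial t} &= D \frac{\partial^2 S_2}{\partial x^2}
  + r S_2 \left(1 - \frac{S_2 + a S_1}{K_2}\right), \qquad K_2 = 0.5, \\
\frac{\partial S_i}{\partial x} &= 0 \quad \text{at } x = 0 \text{ and } x = 1, \qquad i = 1, 2 .
\end{aligned}
```

Note that $K_1(0) = 0$, so on a grid that includes x = 0 the interaction term for species 1 divides by zero in the first cell.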
Here is the example of coding in R for this model:
LVcomp1D <- function (time, state, parms, N, Da, x, dx) {
with (as.list(parms), {
S1 <- state[1:N]
S2 <- state[(N+1):(2*N)]
## Dispersive fluxes; zero-gradient boundaries
FluxS1 <- -Da * diff(c(S1[1], S1, S1[N]))/dx
FluxS2 <- -Da * diff(c(S2[1], S2, S2[N]))/dx
## LV Competition
InteractS1 <- rG * S1 * (1- (S1/x)- ((a*S2)/x))
InteractS2 <- rG * S2 * (1- (S2/(K2))- ((a*S1)/(K2)))
## Rate of change = -Flux gradient + Interaction
dS1 <- -diff(FluxS1)/dx + InteractS1
dS2 <- -diff(FluxS2)/dx + InteractS2
return (list(c(dS1, dS2)))
})
}
pars <- c(rG = 1.0, a = 0.8, K2 = 0.5)
dx <- 0.001
x <- seq(0, 1, by = dx)
N <- length(x)
Da <- 0.001
state <- c(rep(0.5, N), rep(0.5, N))
print(system.time(
out <- steady.1D (y = state, func = LVcomp1D, parms = pars,
nspec = 2, N = N, x = x, dx = dx, Da = Da, pos = TRUE)
))
mf <- par(mfrow = c(2, 2))
plot(out, grid = x, xlab = "x", mfrow = NULL,
ylab = "N(x)", main = c("Species 1", "Species 2"), type = "l")
par(mfrow = mf)
The problem is I cannot get the steady state solutions of the model. I keep getting a horizontal line passing through x-axis. Can you please help me since I do not know what is wrong with this code.
Thank you
