Simplifying R code when fitting data (for loops) - r
I have working code to accomplish my goal, but as you will see it is not very elegant. I've tried writing it with for loops but my coding knowledge is relatively basic. Could some generous person help me simplify my code and hopefully annotate it so I can learn how to do it myself? I know this is a big ask but I appreciate any help, thanks!
The end goal is to produce a table, to export, that contains A0 and kobs values for each fit (along with their respective Std. errors).
The kobs values are then passed to another equation to obtain three additional parameters. In the provided data the [100] data set was excluded from the final analysis (the exclusion is justified by external factors). I am new here so if I am formatting something wrong or violating some standards please let me know and I will fix it.
Data (copied from a .csv file):
Time,[0]1,[0]2,[0]3,[1],[2.5],[6],[16],[40],[100]
0,1.008,,0.963,1.038,0.979,0.973,0.973,0.906,0.979
0,0.992,1.000,1.037,0.962,1.021,1.027,1.027,1.094,1.021
5,0.813,0.968,0.961,0.704,0.667,0.470,,,
5,0.861,0.971,0.913,0.713,0.645,0.512,0.353,0.306,0.351
10,0.820,0.868,0.888,0.613,0.407,0.262,,0.158,0.236
10,0.851,0.857,0.890,0.563,0.444,0.250,0.197,0.169,0.275
15,0.778,0.773,0.798,,,,0.154,0.145,0.204
15,0.778,0.752,0.894,0.552,0.308,0.184,0.109,0.146,0.238
20,0.610,0.727,0.806,0.441,0.247,0.180,0.114,0.143,0.269
20,0.747,0.784,0.806,0.426,0.257,0.176,0.138,0.116,0.345
30,,,,0.321,0.164,0.091,0.127,0.100,0.279
30,0.563,0.642,0.633,0.268,0.146,0.082,0.096,0.096,0.275
40,0.556,0.524,0.529,0.286,0.106,0.082,0.037,0.079,0.249
40,0.581,0.485,0.487,0.266,0.119,0.045,,,0.041
## Assign Time (x) data
t <- D1$Time
## Assign Response (y) data
R1 <- D1$`[0]1`
R2 <- D1$`[0]2`
R3 <- D1$`[0]3`
R4 <- D1$`[1]`
R5 <- D1$`[2.5]`
R6 <- D1$`[6]`
R7 <- D1$`[16]`
R8 <- D1$`[40]`
R9 <- D1$`[100]`
## Fit data
F1 <- nls(R1 ~ A1 * exp(-k1 * t), start = list(A1 = 1, k1 = 0.02))
P1 <- summary(F1)$parameters[,1:2]
F2 <- nls(R2 ~ A1 * exp(-k1 * t), start = list(A1 = 1, k1 = 0.02))
P2 <- summary(F2)$parameters[,1:2]
F3 <- nls(R3 ~ A1 * exp(-k1 * t), start = list(A1 = 1, k1 = 0.02))
P3 <- summary(F3)$parameters[,1:2]
F4 <- nls(R4 ~ A1 * exp(-k1 * t), start = list(A1 = 1, k1 = 0.02))
P4 <- summary(F4)$parameters[,1:2]
F5 <- nls(R5 ~ A1 * exp(-k1 * t), start = list(A1 = 1, k1 = 0.02))
P5 <- summary(F5)$parameters[,1:2]
F6 <- nls(R6 ~ A1 * exp(-k1 * t), start = list(A1 = 1, k1 = 0.02))
P6 <- summary(F6)$parameters[,1:2]
F7 <- nls(R7 ~ A1 * exp(-k1 * t), start = list(A1 = 1, k1 = 0.02))
P7 <- summary(F7)$parameters[,1:2]
F8 <- nls(R8 ~ A1 * exp(-k1 * t), start = list(A1 = 1, k1 = 0.02))
P8 <- summary(F8)$parameters[,1:2]
F9 <- nls(R9 ~ A1 * exp(-k1 * t), start = list(A1 = 1, k1 = 0.02))
P9 <- summary(F9)$parameters[,1:2]
## Assemble Table
SS <- c(colnames(D1)[2],colnames(D1)[3],colnames(D1)[4],colnames(D1)[5],colnames(D1)[6],colnames(D1)[7],colnames(D1)[8],colnames(D1)[9],colnames(D1)[10])
A0 <- c(P1[1,1],P2[1,1],P3[1,1],P4[1,1],P5[1,1],P6[1,1],P7[1,1],P8[1,1],P9[1,1])
SEA0 <- c(P1[1,2],P2[1,2],P3[1,2],P4[1,2],P5[1,2],P6[1,2],P7[1,2],P8[1,2],P9[1,2])
kobs <- c(P1[2,1],P2[2,1],P3[2,1],P4[2,1],P5[2,1],P6[2,1],P7[2,1],P8[2,1],P9[2,1])
SEkobs <- c(P1[2,2],P2[2,2],P3[2,2],P4[2,2],P5[2,2],P6[2,2],P7[2,2],P8[2,2],P9[2,2])
ExTab <- cbind(SS, A0, SEA0, kobs, SEkobs)
write_clip(ExTab)
conI <- c(0,0,0,0.5,1.5,4,12,35)
kobsA <- c(P1[2,1],P2[2,1],P3[2,1],P4[2,1],P5[2,1],P6[2,1],P7[2,1],P8[2,1])
kFit <- nls(kobsA ~ k0 + ((kin*conI)/(KI+conI)), start = list(k0 = 0.1, kin = 0.2, KI = 3))
summary(kFit)
My inclination is to use for loops to reduce the repetition, but I am unable to write working ones. I would also like to eliminate the "## Assign Response (y) data" section so that I can apply the code to generic data sets with different concentrations in the column names, but using D1[2] in place of R1 inside the nls function produces an error.
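You can start like this. Loop over the response columns only (column 1 is Time) and create the result vectors before filling them:
t <- D1$Time
SS <- colnames(D1)[-1]   # names of the response columns
n <- length(SS)
A0 <- SEA0 <- kobs <- SEkobs <- numeric(n)
for (i in seq_len(n)) {
  # [[ ]] returns a plain numeric vector; D1[2] is a one-column data frame, which nls rejects
  y <- D1[[i + 1]]
  fit <- nls(y ~ A1 * exp(-k1 * t), start = list(A1 = 1, k1 = 0.02))
  P <- summary(fit)$parameters[, 1:2]
  A0[i] <- P[1, 1]
  SEA0[i] <- P[1, 2]
  kobs[i] <- P[2, 1]
  SEkobs[i] <- P[2, 2]
}
ExTab <- cbind(SS, A0, SEA0, kobs, SEkobs)
write_clip(ExTab)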
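The same table can also be built without an explicit loop. The following is a minimal sketch, not the only way to do it: it assumes the data are read with check.names = FALSE so the bracketed column names survive, and fit_one is a hypothetical helper name introduced here for illustration. A fit that fails to converge is recorded as NA rather than stopping the run, which is a design choice of this sketch.

```r
# Data from the question; check.names = FALSE keeps names such as "[0]1" verbatim.
csv_text <- "Time,[0]1,[0]2,[0]3,[1],[2.5],[6],[16],[40],[100]
0,1.008,,0.963,1.038,0.979,0.973,0.973,0.906,0.979
0,0.992,1.000,1.037,0.962,1.021,1.027,1.027,1.094,1.021
5,0.813,0.968,0.961,0.704,0.667,0.470,,,
5,0.861,0.971,0.913,0.713,0.645,0.512,0.353,0.306,0.351
10,0.820,0.868,0.888,0.613,0.407,0.262,,0.158,0.236
10,0.851,0.857,0.890,0.563,0.444,0.250,0.197,0.169,0.275
15,0.778,0.773,0.798,,,,0.154,0.145,0.204
15,0.778,0.752,0.894,0.552,0.308,0.184,0.109,0.146,0.238
20,0.610,0.727,0.806,0.441,0.247,0.180,0.114,0.143,0.269
20,0.747,0.784,0.806,0.426,0.257,0.176,0.138,0.116,0.345
30,,,,0.321,0.164,0.091,0.127,0.100,0.279
30,0.563,0.642,0.633,0.268,0.146,0.082,0.096,0.096,0.275
40,0.556,0.524,0.529,0.286,0.106,0.082,0.037,0.079,0.249
40,0.581,0.485,0.487,0.266,0.119,0.045,,,0.041
"
D1 <- read.csv(text = csv_text, check.names = FALSE)

# Hypothetical helper: fit y = A0 * exp(-kobs * t) to one response column
# and return the two estimates with their standard errors.
fit_one <- function(y, t) {
  fit <- tryCatch(
    nls(y ~ A0 * exp(-kobs * t), start = list(A0 = 1, kobs = 0.02)),
    error = function(e) NULL  # assumption: a failed fit is reported as NA
  )
  if (is.null(fit)) {
    return(c(A0 = NA_real_, SEA0 = NA_real_, kobs = NA_real_, SEkobs = NA_real_))
  }
  est <- summary(fit)$coefficients
  c(A0     = est["A0", "Estimate"],
    SEA0   = est["A0", "Std. Error"],
    kobs   = est["kobs", "Estimate"],
    SEkobs = est["kobs", "Std. Error"])
}

# Apply the helper to every column except Time; one row per data set.
res <- t(sapply(D1[-1], fit_one, t = D1$Time))
ExTab <- data.frame(SS = rownames(res), res, row.names = NULL)
ExTab
```

Because the column names are taken from the data frame itself, the same code should apply to a file with a different set of concentrations, as long as the first column is Time.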
Related
Simulate data with R package SimCorMultRes for correlated binary data
I am simulating data with R package SimCorMultRes. My code is below for cluster size 5 and 9. Now I want to do the same thing for cluster size 100. It's unrealistic to write down p1, p2, ..., p99. I am just wondering if anybody can help me.
library(SimCorMultRes)
cluster_size <- 5
x <- runif(1000*cluster_size)
p1 <- rep(c(1,0,0,0,0),1000)
p2 <- rep(c(0,1,0,0,0),1000)
p3 <- rep(c(0,0,1,0,0),1000)
p4 <- rep(c(0,0,0,1,0),1000)
beta_intercepts <- -3
beta_coefficients <- c(0,rnorm(4))
latent_correlation_matrix <- toeplitz(c(1, 0.2, 0.2, 0.2, 0.2))
# simulation of clustered binary responses
simulated_binary_dataset <- rbin(clsize = cluster_size, intercepts = beta_intercepts, betas = beta_coefficients, xformula = ~x+p1+p2+p3+p4, cor.matrix = latent_correlation_matrix, link = "logit")

library(SimCorMultRes)
cluster_size <- 9
x <- runif(1000*cluster_size)
p1 <- rep(c(1,rep(0,8)),1000)
p2 <- rep(c(0,1,rep(0,7)),1000)
p3 <- rep(c(rep(0,2),1,rep(0,6)),1000)
p4 <- rep(c(rep(0,3),1,rep(0,5)),1000)
p5 <- rep(c(rep(0,4),1,rep(0,4)),1000)
p6 <- rep(c(rep(0,5),1,rep(0,3)),1000)
p7 <- rep(c(rep(0,6),1,rep(0,2)),1000)
p8 <- rep(c(rep(0,7),1,rep(0,1)),1000)
beta_intercepts <- -3
beta_coefficients <- c(0,rnorm(8))
latent_correlation_matrix <- toeplitz(c(1, rep(0.2, 8)))
simulated_binary_dataset <- rbin(clsize = cluster_size, intercepts = beta_intercepts, betas = beta_coefficients, xformula = ~x+p1+p2+p3+p4+p5+p6+p7+p8, cor.matrix = latent_correlation_matrix, link = "logit")
I found a way to do it:
library(SimCorMultRes)
cluster_size <- 9
size <- 1000
x <- runif(size*cluster_size)
p1 <- rep(c(1,rep(0,cluster_size-1)),size)
# create p2 ... p(cluster_size-1) by name
for (i in 1:(cluster_size-2)) {
  a <- paste0("p", i+1)
  assign(a, rep(c(rep(0,i),1,rep(0,cluster_size-1-i)),size))
}
PredictorVariables <- paste("p", 1:(cluster_size-1), sep="")
Formula <- formula(paste("~ x + ", paste(PredictorVariables, collapse=" + ")))
beta_intercepts <- -3
beta_coefficients <- c(0,rnorm(cluster_size-1))
latent_correlation_matrix <- toeplitz(c(1, rep(0.2, cluster_size-1)))
simulated_binary_dataset <- rbin(clsize = cluster_size, intercepts = beta_intercepts, betas = beta_coefficients, xformula = Formula, cor.matrix = latent_correlation_matrix, link = "logit")
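As an alternative to assign(), the indicator variables can be held as columns of one matrix and the formula built with reformulate(). This is only a sketch in base R: the small cluster_size and size values are chosen for illustration, and how the columns are handed to rbin (as separate variables or through a data argument) is left open here because it is not confirmed by the thread.

```r
cluster_size <- 5   # illustrative; the question asks about 100
size <- 3           # illustrative number of clusters; the question uses 1000

# Rows of the identity matrix give the 1/0 pattern for each position in a
# cluster; dropping the last column leaves cluster_size - 1 indicators.
one_hot <- diag(cluster_size)[, -cluster_size, drop = FALSE]

# Repeat the same pattern once per cluster.
indicators <- one_hot[rep(seq_len(cluster_size), times = size), , drop = FALSE]
colnames(indicators) <- paste0("p", seq_len(cluster_size - 1))

# One-sided formula ~ x + p1 + ... built from the column names.
Formula <- reformulate(c("x", colnames(indicators)))
Formula
```

Each column matches the corresponding hand-written vector, for example indicators[, "p1"] equals rep(c(1,0,0,0,0), size) when cluster_size is 5.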
Every time variable increases, create a new sample using a for loop
I need to do the following: I would like to increase variable pi1 from -1 to 0 by 0.0001 steps under the condition that all the other variables stay the same. As a second step I need to select 1000 samples for each different pi1 value. In the end, I would need to measure the bias of both regressions relative to the real value. After a thorough investigation, I really don't see why both loops are not working. This is a rough idea of how it could work:
index <- seq(1, 1000)
beta_OLS <- NULL
beta_IV <- NULL
for(i in seq(from = -1, to = 0, by = 0.001)) {
  for(k in index) {
    n <- 2000
    pi1 <- i
    b0 <- 0
    b1 <- 0
    b2 <- -1/1000
    b3 <- 1/5
    z <- runif(n, 0, 25)
    ov <- rnorm(n, 0, 1)
    d <- -1/2 + pi1 * z + 1/2 * ov + rnorm(n, 0, 1) > 0
    y <- b0 + b1 * d + b2 * z + b3 * ov + rnorm(n, 0, 1/10)
    #OLS Regression
    model12 = lm(y ~ d, data = data)
    beta_OLS[k] = model12$coefficients[2]
    #IV Regression
    model12_1 = ivreg(y ~ d | z, data=data)
    beta_IV[k] = model12_1$coefficients[2]
  }
}
real_value <- - 1/1000
average_OLS <- mean(beta_OLS)
average_IV <- mean(beta_IV)
biased_OLS <- average_OLS - real_value
biased_IV <- average_IV - real_value
biased_OLS
biased_IV
I made a few alterations and it ran, though I am not sure the results are right. Since both loops had the same count, I eliminated one of them. I also assigned d and y on every run to a data frame called data (which you used as the source for your regressions).
index = seq(1,1000)
beta_OLS = NULL
beta_IV = NULL
i = -1
for(k in index){
  n <- 2000
  pi1 <- i
  b0 <- 0
  b1 <- 0
  b2 <- -1/1000
  b3 <- 1/5
  z <- runif(n,0,25)
  ov <- rnorm(n,0,1)
  d <- -1/2 + pi1 * z + 1/2 * ov + rnorm(n,0,1) > 0
  y <- b0 + b1 * d + b2 * z + b3 * ov + rnorm(n,0,1/10)
  data = as.data.frame(cbind(y,d))
  #OLS Regression
  model12 = lm(y ~ d, data = data)
  beta_OLS[k] = model12$coefficients[2]
  #IV Regression
  model12_1 = ivreg::ivreg(y ~ d | z, data=data)
  beta_IV[k] = model12_1$coefficients[2]
  # advance i, not pi1: pi1 is reset from i at the top of each iteration
  i <- i + 0.001
}
real_value = - 1/1000
average_OLS = mean(beta_OLS)
average_IV = mean(beta_IV)
biased_OLS = average_OLS - real_value
biased_IV = average_IV - real_value
biased_OLS
biased_IV
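If the goal is several samples for each pi1 value, one option is to keep both loops and store the estimates in a matrix with one column per pi1 value. The sketch below rests on several assumptions: a short pi1 grid and 50 replicates are used to keep it quick, the seed is arbitrary, and the estimates are computed with base R moment formulas instead of lm and ivreg. With a single regressor the OLS slope is cov(d, y) / var(d), and with a single instrument the IV estimate is cov(z, y) / cov(z, d).

```r
set.seed(1)                            # assumption: arbitrary seed for reproducibility
pi1_grid <- seq(-0.1, 0, by = 0.05)    # illustrative grid, not the full -1 to 0 range
n_rep <- 50                            # illustrative; the question asks for 1000
n <- 2000

# One row per replicate, one column per pi1 value.
beta_OLS <- matrix(NA_real_, nrow = n_rep, ncol = length(pi1_grid),
                   dimnames = list(NULL, paste0("pi1=", pi1_grid)))
beta_IV <- beta_OLS

for (j in seq_along(pi1_grid)) {
  for (k in seq_len(n_rep)) {
    z  <- runif(n, 0, 25)
    ov <- rnorm(n, 0, 1)
    d  <- as.numeric(-1/2 + pi1_grid[j] * z + 1/2 * ov + rnorm(n, 0, 1) > 0)
    # b0 = b1 = 0, b2 = -1/1000, b3 = 1/5, as in the question
    y  <- -1/1000 * z + 1/5 * ov + rnorm(n, 0, 1/10)
    beta_OLS[k, j] <- cov(d, y) / var(d)     # OLS slope of y on d
    beta_IV[k, j]  <- cov(z, y) / cov(z, d)  # IV estimate with instrument z
  }
}

# Average estimate for each pi1 value.
colMeans(beta_OLS)
colMeans(beta_IV)
```

Note that when pi1 is 0 the instrument z is unrelated to d, so cov(z, d) is close to zero and the IV column for that value can be very unstable.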
nls: Loop and break in a decided number of iterations
I'd like to perform two sequential operations: 1) fit two nls models, one per subset; and 2) refit the models in a loop until the number of iterations equals 1. For the first step I do:
#Packages
library(minpack.lm)
# Data set - Diameter as a function of Feature and Age
Feature<-sort(rep(c("A","B"),22))
Age<-c(60,72,88,96,27, 36,48,60,72,88,96,27,36,48,60,72, 88,96,27,36,48,60,27,27,36,48,60, 72,88,96,27,36,48,60,72,88,96,27, 36,48,60,72,88,96)
Diameter<-c(13.9,16.2, 19.1,19.3,4.7,6.7,9.6,11.2,13.1,15.3, 15.4,5.4,7,9.9,11.7,13.4,16.1,16.2, 5.9,8.3,12.3,14.5,2.3,5.2,6.2,8.6,9.3, 11.3,15.1,15.5,5,7,7.9,8.4,10.5,14,14, 4.1,4.9,6,6.7,7.7,8,8.2)
d<-dados <- data.frame(Feature,Age,Diameter)
str(d)
#Create a nls model (Levenberg-Marquardt algorithm) for each Feature (A and B)
e1<- Diameter ~ a1 * Age^a2
Fecture_vec<-unique(d$Feature)
mod_ND <- list() #List to save each model
for(i in 1:length(Fecture_vec)){
  d2 <- subset(d, d$Feature == Fecture_vec[i])
  mod_ND[[i]] <- nlsLM(e1, data = d2, start = list(a1 = 0.1, a2 = 10), control = nls.control(maxiter = 1000))
  print(summary(mod_ND[[i]]))
}
So far so good. But when I try to loop over 999 simulations, recycling the start values with coef(mod_ND[[i]])[1] and coef(mod_ND[[i]])[2], and stopping when the number of iterations is 1:
e1<- Diameter ~ a1 * Age^a2
Fecture_vec<-unique(d$Feature)
mod_ND <- list() #List to save each model
for(i in 1:length(Fecture_vec)){
  d2 <- subset(d, d$Feature == Fecture_vec[i])
  mod_ND[[i]] <- nlsLM(e1, data = d2, start = list(a1 = 0.1, a2 = 10), control = nls.control(maxiter = 1000))
  Xs<-data.frame()
  for(z in 1:999){
    d2 <- subset(d, d$Feature == Fecture_vec[i])
    mod_ND[[z]] <- nlsLM(e1, data = d2, start = list(a1 = coef(mod_ND[[i]])[1], a2 = mod_ND[[i]])[2]), control = nls.control(maxiter = 1000))
    if (mod_ND[[z,c(finIter")]] <= 1){ break } ## Stop when iterations = 1
    print(summary(mod_ND[[z]]))
  }
}
It doesn't work! Any ideas, please?
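#Packages
library(minpack.lm)
library(plyr)   # dlply() and .() come from plyr, not dplyr
# Refit recursively, reusing the previous estimates as start values,
# until the fit converges within one iteration
m<-function(d, a=0.01,b=10){
  mod<- nlsLM(Diameter ~ a1 * Age^a2,start = list(a1 = a, a2 = b),control = nls.control(maxiter = 1000), data = d)
  par1<- summary(mod)$coefficients[[1]]   # estimate of a1
  par2 <- summary(mod)$coefficients[[2]]  # estimate of a2
  print(summary(mod))
  if(mod$convInfo[["finIter"]]>1){
    m(d,par1,par2)
  }else{
    print(" --------final fit for this Feature-----------")
    mod   # return the final model so that list_models holds the fits
  }
}
list_models <- dlply(d,.(Feature),m)
list_models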
`nlme` with crossed random effects
I am trying to fit a non-linear model with crossed random effects, analogous to the linear random effect models mentioned in this question and in this mailing list post, using the nlme package. However, I get an error regardless of what I try. Here is an example:
library(nlme)
#####
# simulate data
set.seed(18112003)
na <- 30
nb <- 30
sigma_a <- 1
sigma_b <- .5
sigma_res <- .33
n <- na*nb
a <- gl(na,1,n)
b <- gl(nb,na,n)
u <- gl(1,1,n)
x <- runif(n, -3, 3)
y_no_noise <- x + sin(2 * x)
y <- x + sin(2 * x) + rnorm(na, sd = sigma_a)[as.integer(a)] + rnorm(nb, sd = sigma_b)[as.integer(b)] + rnorm(n, sd = sigma_res)
#####
# works in the linear model where we know the true parameter
fit <- lme(
  # somehow we found the right values
  y ~ x + sin(2 * x),
  random = list(u = pdBlocked(list(pdIdent(~ a - 1), pdIdent(~ b - 1)))))
vv <- VarCorr(fit)
vv2 <- vv[c("a1", "b1"), ]
storage.mode(vv2) <- "numeric"
print(vv2,digits=4)
#R    Variance StdDev
#R a1 1.016    1.0082
#R b1 0.221    0.4701
#####
# now try to do the same with `nlme`
fit <- nlme(
  y ~ c0 + sin(c1),
  fixed = list(c0 ~ x, c1 ~ x - 1),
  random = list(u = pdBlocked(list(pdIdent(~ a - 1), pdIdent(~ b - 1)))),
  start = c(0, 0.5, 1))
#R Error in nlme.formula(y ~ a * x + sin(b * x), fixed = list(a ~ 1, b ~ :
#R 'random' must be a formula or list of formulae
The lme example is similar to the one on pages 163-166 of "Mixed-effects Models in S and S-PLUS", with only 2 random effects instead of 3.
I should have used a two-sided formula, as written in help("nlme"):
fit <- nlme(
  y ~ c0 + c1 + sin(c2),
  fixed = list(c0 ~ 1, c1 ~ x - 1, c2 ~ x - 1),
  random = list(u = pdBlocked(list(pdIdent(c0 ~ a - 1), pdIdent(c1 ~ b - 1)))),
  start = c(0, 0.5, 1))
# fixed effects estimates
fixef(fit)
#R c0.(Intercept)           c1.x           c2.x
#R     -0.1788218      0.9956076      2.0022338
# covariance estimates
vv <- VarCorr(fit)
vv2 <- vv[c("c0.a1", "c1.b1"), ]
storage.mode(vv2) <- "numeric"
print(vv2,digits=4)
#R       Variance StdDev
#R c0.a1 0.9884   0.9942
#R c1.b1 0.2197   0.4688
solving for steady state PDE using steady.1D (rootSolve R)
I am trying to obtain a steady state for a spatially explicit Lotka-Volterra competition model of two competing species (with spatial diffusion). Here is the model (without the diffusion term): http://en.wikipedia.org/wiki/Competitive_Lotka%E2%80%93Volterra_equations where I let r1 = r2 = rG and alpha12 = alpha21 = a. The carrying capacity of species 1 is assumed to vary linearly across space x, i.e. K1 = x (while K2 = 0.5). We assume Neumann boundary conditions. The spatial domain x is from 0 to 1. Here is the R code for this model:
LVcomp1D <- function (time, state, parms, N, Da, x, dx) {
  with (as.list(parms), {
    S1 <- state[1:N]
    S2 <- state[(N+1):(2*N)]
    ## Dispersive fluxes; zero-gradient boundaries
    FluxS1 <- -Da * diff(c(S1[1], S1, S1[N]))/dx
    FluxS2 <- -Da * diff(c(S2[1], S2, S2[N]))/dx
    ## LV Competition
    InteractS1 <- rG * S1 * (1- (S1/x)- ((a*S2)/x))
    InteractS2 <- rG * S2 * (1- (S2/(K2))- ((a*S1)/(K2)))
    ## Rate of change = -Flux gradient + Interaction
    dS1 <- -diff(FluxS1)/dx + InteractS1
    dS2 <- -diff(FluxS2)/dx + InteractS2
    return (list(c(dS1, dS2)))
  })
}
pars <- c(rG = 1.0, a = 0.8, K2 = 0.5)
dx <- 0.001
x <- seq(0, 1, by = dx)
N <- length(x)
Da <- 0.001
state <- c(rep(0.5, N), rep(0.5, N))
print(system.time(
  out <- steady.1D (y = state, func = LVcomp1D, parms = pars, nspec = 2, N = N, x = x, dx = dx, Da = Da, pos = TRUE)
))
mf <- par(mfrow = c(2, 2))
plot(out, grid = x, xlab = "x", mfrow = NULL, ylab = "N(x)", main = c("Species 1", "Species 2"), type = "l")
par(mfrow = mf)
The problem is that I cannot get the steady-state solutions of the model. I keep getting a horizontal line along the x-axis. Can you please help me, since I do not know what is wrong with this code? Thank you.