I apologize in advance for the elementary question, but thought it may be a quick pointer from someone out there.
I am looking at this publicly available code and wondering why it runs slowly (or stalls completely) when the mu for the negative binomial generation is > 1. Is it related to the nested loops?
Thank you.
for(i in 1:runs) {
cases <- seed
t <- rep(0,seed)
times <- t
while(cases > 0) {
secondary <- rnbinom(cases,size=fit.cases$estimate[1],mu=fit.cases$estimate[2])
t.new <- numeric()
for(j in 1:length(secondary)) {
t.new <- c(t.new,t[j] + rgamma(secondary[j],shape=fit.serial$estimate[1],
rate=fit.serial$estimate[2]))
}
cases <- length(t.new)
t <- t.new
times <- c(times,t.new)
}
lines(sort(times),1:length(times),col=cols[i],lwd=1)
points(max(times),length(times),col=cols[i],pch=16)
}
https://github.com/calthaus/Ebola/blob/master/Superspreading%20(Lancet%20Inf%20Dis%202015)/Ebola_superspreading_analysis.R
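One possible explanation, offered as a sketch rather than a confirmed diagnosis: when mu (the mean number of secondary cases) is above 1, the branching process is supercritical, so cases tends to grow geometrically from one generation to the next and while(cases > 0) may never reach zero. Growing t.new with c() inside the for loop adds copying cost on top of that. The version below replaces the inner loop with rep() and adds a cap. The function name simulate_chain, the max_cases argument and the parameter values are hypothetical and not part of the original script.

```r
# Sketch: one transmission chain, vectorised over the current generation.
# max_cases is an assumed safety cap; it is not in the original code.
simulate_chain <- function(seed, size, mu, shape, rate, max_cases = 1e4) {
  t <- rep(0, seed)      # infection times of the current generation
  times <- t             # infection times of all cases so far
  while (length(t) > 0 && length(times) < max_cases) {
    secondary <- rnbinom(length(t), size = size, mu = mu)
    # rep() repeats each parent's time once per secondary case,
    # replacing the inner for loop and the repeated c() calls
    t <- rep(t, secondary) +
      rgamma(sum(secondary), shape = shape, rate = rate)
    times <- c(times, t)
  }
  times
}

set.seed(1)
# Illustrative parameter values only
chain <- simulate_chain(seed = 1, size = 0.2, mu = 2, shape = 2, rate = 0.2,
                        max_cases = 1000)
length(chain)
```

With the cap in place the loop always terminates, although the final generation can overshoot max_cases because the check happens once per generation.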
Related
I want to create a model where I duplicate a sentence several times, introducing random error each time. The duplicates of the sentence also get duplicated. So, in cycle one, I start with just "example_sentence". In cycle two, I have two copies of that sentence. In cycle three, I have 4 copies of that sentence. I want to do this for 25 cycles with 20k sentences. The code I wrote to do that works way too slowly, and I am wondering if there is a way to make my nested for loops more efficient? Here is the part of the code that is the slowest:
alphabet <- c("a","b","d","j")
modr1 <- "sentencetoduplicate"
errorRate <- c()
errorRate <- append(errorRate, rep(1,1))
errorRate <- append(errorRate, rep(0,999))
duplicate <- c(modr1)
for (q in 1:25) {
collect <- c()
for (z in seq_along(duplicate)) {
modr1 = duplicate[z]
compile1 <- c()
for (k in 1:nchar(modr1)) {
error <- sample(errorRate, 1, replace = TRUE)
if (error == 1) {
compile1 <- append(compile1, sub(substring(modr1,k,k),sample(alphabet,1,replace=TRUE),substring(modr1,k,k)))
} else {
compile1 <- append(compile1, substring(modr1,k,k))
}
}
modr1 <- paste(compile1, collapse='')
collect <- append(collect, modr1)
}
duplicate <- append(duplicate, collect)
}
Here is a faster approach to your loop, but I think the problem of scaling it to 20K sentences remains!
f <- function(let, alphabet = c("a","b","c","d","j"),error_rate=1/1000) {
lenlet=length(let)
let = unlist(let)
k <- rbinom(length(let),1,prob = error_rate)
let[k==1] <- sample(alphabet,size = sum(k==1), replace=T)
return(as.list(as.data.frame(matrix(let, ncol=lenlet))))
}
library(data.table)
modr1 <- "sentencetoduplicate"
k <- data.table(list(strsplit(modr1,"")[[1]]))
for(q in 1:25) {
k[, V1:=list(f(V1))]
k <- k[rep(1:nrow(k),2)]
}
Updated with slightly faster version! (Notice this is no longer by=1:nrow(k))
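For comparison, here is a base-R sketch of the same idea without data.table. It relies on the fact that substitutions never change a sentence's length, so all sentences can be held in one character matrix and mutated in a single step. The function name mutate_sentences is made up for illustration, and only 10 cycles are run here to keep the example small.

```r
# Sketch: mutate every character of every sentence in one vectorised step.
# Assumes all sentences have the same, non-zero length.
mutate_sentences <- function(sentences, alphabet, error_rate) {
  # One row per sentence, one column per character position
  m <- do.call(rbind, strsplit(sentences, "", fixed = TRUE))
  # Decide independently for each character whether it is replaced
  hit <- runif(length(m)) < error_rate
  m[hit] <- sample(alphabet, sum(hit), replace = TRUE)
  # Glue each row back into a single string
  apply(m, 1, paste, collapse = "")
}

set.seed(1)
alphabet <- c("a", "b", "d", "j")
pop <- "sentencetoduplicate"
for (q in 1:10) {
  # Keep the existing sentences and append one mutated copy of each
  pop <- c(pop, mutate_sentences(pop, alphabet, error_rate = 1 / 1000))
}
length(pop)
```

The population still doubles every cycle, so 25 cycles from a single sentence gives 2^25 (about 33.5 million) strings. Memory rather than loop overhead is then likely to be the limit.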
I'm trying to code online PCA in R. There is no existing implementation of this available, so it may be useful for others as well. The pseudo-code can be found here (Algorithm 1). What I've done so far is as follows:
PCA<-function(X,k,epsilon){
X_f<-norm(as.matrix(X),"f")
d<-nrow(X)
n<-ncol(X)
l<-floor((8*k)/(epsilon^2))
U<-matrix(0,d,l)
C<-matrix(0,d,d)
Y<-matrix(0,n,l)
for(t in 1:n){
r<-X[,t]-(U%*%t(U)%*%X[,t])
n<-C + r%*%t(r)
while(norm(n,"2") >= 2*(X_f^2)/l){
lamb<-eigen(C)$values[1]
u<-eigen(C)$vectors[,1]
U<-cbind(U,u)
#U[,which(!apply(U==0,2,all))]
C<-C-(lamb*(u%*%t(u)))
r<-X[,t]-(U%*%t(U)%*%X[,t])
}
C<-C+(r%*%t(r))
y<-matrix(0,1,l)
y<-t(U)%*%x_t
Y[t,]<-y
}
return(Y)
}
To test the code I used the famous Fisher iris data:
log.ir <- log(iris[, 1:4])
ir.species <- iris[, 5]
ir.pca <- PCA(log.ir,50,0.2)
There seems to be a bug in the code that is not obvious to me: the while loop never stops. Can someone please help?
It's because the condition in while(norm(n,"2") >= 2*(X_f^2)/l) never becomes false: 2*(X_f^2)/l is always smaller than norm(n,"2").
In fact, if you print out these values and run debug(PCA), you'll see that they never change, because nothing inside the loop body reassigns n, X_f or l.
PCA<-function(X,k,epsilon){
X_f<-norm(as.matrix(X),"f")
d<-nrow(X)
n<-ncol(X)
l<-floor((8*k)/(epsilon^2))
U<-matrix(0,d,l)
C<-matrix(0,d,d)
Y<-matrix(0,n,l)
for(t in 1:n){
r<-X[,t]-(U%*%t(U)%*%X[,t])
n<-C + r%*%t(r)
while(norm(n,"2") >= 2*(X_f^2)/l){
print(norm(n,"2") )
print(2*(X_f^2)/l)
lamb<-eigen(C)$values[1]
u<-eigen(C)$vectors[,1]
U<-cbind(U,u)
U[,which(!apply(U==0,2,all))]
C<-C-(lamb*(u%*%t(u)))
r<-X[,t]-(U%*%t(U)%*%X[,t])
}
C<-C+(r%*%t(r))
y<-matrix(0,1,l)
y<-t(U)%*%x_t
Y[t,]<-y
}
return(Y)
}
debug(PCA)
In general, adding print statements inside the function you want to debug is a good way to diagnose problems.
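To illustrate the diagnosis, here is a minimal sketch of a loop that recomputes the tested quantity on every pass, so the condition can actually change. It is not a corrected online PCA: the function name deflate_until, the max_iter safeguard and the test matrix are assumptions made for illustration, and the input is assumed to be symmetric positive semi-definite.

```r
# Sketch: remove top eigen-components until the spectral norm drops
# below a threshold. The tested value is recomputed inside the loop.
deflate_until <- function(C, threshold, max_iter = 100) {
  iter <- 0
  current <- norm(C, "2")
  while (current >= threshold && iter < max_iter) {
    e <- eigen(C, symmetric = TRUE)    # call eigen() once, reuse both parts
    lamb <- e$values[1]
    u <- e$vectors[, 1]
    C <- C - lamb * (u %*% t(u))
    current <- norm(C, "2")            # update what the condition tests
    iter <- iter + 1                   # cap guards against endless loops
  }
  list(C = C, iterations = iter, norm = current)
}

res <- deflate_until(diag(c(5, 3, 1)), threshold = 2)
res$iterations
```

In the function from the question, the equivalent change would be to recompute n (ideally under a name that does not overwrite the column count) after C and r are updated inside the while loop.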
I'm sorry to say that I have a problem with a for loop, again. I'm trying to save the final number from a population estimate for loop into a new matrix, but I am only able to get the population estimate to show up in row 100. I know it relates to breedingPop2 but I can't figure it out. Any help would be much appreciated. Please find the code below:
finalPop=matrix(nrow=102, ncol=1)
for(i in 1:100){
SWWAyears=data.frame(iteration=rep(NA,101),pop=NA)
breedingPop<-90000
fallMig<-.825
springMig<-.825
winterSurvival<-rbeta(100,.95,.05)
npFecund<-rbinom(100, 3.0, .9)
pFecund<-rbeta(100, .85,.25)
breedingSurvival<-rbeta(100,.95,.05)
# Set initial starting condition
SWWAyears[1,2]=breedingPop
for(years in 2:101) {
fallPop<-(SWWAyears[years-1,2]*fallMig)
for (i in 1:100){
winterPop<-(fallPop*winterSurvival[i])}
springPop<-(winterPop*springMig)
for (i in 1:100){
summerPop<-(springPop*breedingSurvival[i])
}
for(i in 1:100){
breedingPop2<-((summerPop*.26)*npFecund[i])+((summerPop*.14)*pFecund[i])+(summerPop*.60)
}
SWWAyears[years,1]=years
SWWAyears[years,2]<-breedingPop2
}
finalPop[i,1]<-breedingPop2
}
I think you have more fundamental issues with your looping structure, and you are probably not getting the results you expect. However, the answer to your specific question about only the 100th row being updated is:
Your variable i is being updated inside your 'inner' for() loops, so by the time you reach finalPop[i, 1] <- breedingPop2, i always equals 100.
You need to use a different variable, j for example, in your inner for() loops.
finalPop=matrix(nrow=102, ncol=1)
for(i in 1:100){
SWWAyears = data.frame(iteration=rep(NA,101),pop=NA)
breedingPop <- 90000
fallMig <- .825
springMig <- .825
winterSurvival <- rbeta(100,.95,.05)
npFecund <- rbinom(100, 3.0, .9)
pFecund <- rbeta(100, .85,.25)
breedingSurvival <- rbeta(100,.95,.05)
# Set initial starting condition
SWWAyears[1,2] = breedingPop
for(years in 2:101) {
fallPop <- (SWWAyears[years-1,2]*fallMig)
for (j in 1:100){
winterPop <- (fallPop*winterSurvival[j])
}
springPop <- (winterPop*springMig)
for (j in 1:100){
summerPop <- (springPop*breedingSurvival[j])
}
for(j in 1:100){
breedingPop2 <- ((summerPop*.26)*npFecund[j])+((summerPop*.14)*pFecund[j])+(summerPop*.60)
}
SWWAyears[years,1] = years
SWWAyears[years,2] <- breedingPop2
}
finalPop[i,1] <- breedingPop2
}
Having said that, using multiple nested for() loops is generally not recommended in R; you should be able to use matrix multiplication / vectorisation to achieve the same result.
Other Issues
Your values of winterPop and summerPop will only ever be fallPop * winterSurvival[100] and springPop * breedingSurvival[100] respectively. Is this what you intended?
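If the intent was for each simulated year to use its own random draw, the inner loops can be dropped altogether. The sketch below rests on that assumption, which the question does not confirm. The function name simulate_final_pop and its arguments are made up for illustration.

```r
# Sketch: one population trajectory, using the draw for year `yr` in year `yr`.
# Assumes one survival/fecundity draw per year was the intended model.
simulate_final_pop <- function(n_years = 100, start_pop = 90000) {
  fallMig <- 0.825
  springMig <- 0.825
  winterSurvival <- rbeta(n_years, .95, .05)
  npFecund <- rbinom(n_years, 3, .9)
  pFecund <- rbeta(n_years, .85, .25)
  breedingSurvival <- rbeta(n_years, .95, .05)

  pop <- start_pop
  for (yr in seq_len(n_years)) {
    summerPop <- pop * fallMig * winterSurvival[yr] *
      springMig * breedingSurvival[yr]
    pop <- summerPop * (0.26 * npFecund[yr] + 0.14 * pFecund[yr] + 0.60)
  }
  pop
}

set.seed(1)
# 100 independent runs; no shared loop index that could be overwritten
finalPop <- vapply(1:100, function(i) simulate_final_pop(), numeric(1))
length(finalPop)
```

Wrapping one run in a function also removes the clash over i, because the loop variable inside the function is local to it.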
For an assignment, we need to draw a Christmas tree in R.
I've searched the internet and found some helpful pieces of advice, but at the end of the day, I don't know how to proceed and hope someone can help me.
This is my code so far.
#ctree: prints a Christmas tree on screen with size N
ctree <- function(N){
for (i in 1:N){
width = sample("*",i,replace=T)
cat(width,sep="-","\n")
}
cat(width[1],"\n")
}
This leaves me with the middle and right side of my tree (with N=4), which is great, but not enough.
*-
*-*-
*-*-*-
*-*-*-*-
*
I planned on reversing what I had (basically right-aligning the product of the function) to create the left side, subsequently delete the rightmost column of the left side and glue it together with the right side of the tree, creating a Christmas tree.
I really hope that someone can help me achieve this! Looking forward to your advice.
Thanks in advance.
For anyone interested: this is what I ended up doing in R to create a Christmas tree.
#ctree: prints a Christmas tree on screen with amount of branch levels N
ctree <- function(N){
filler = "*"
blank = ""
for (i in 1:N){
row = c(sample(blank,N-i,replace=T),sample(filler,i,replace=T),sample(blank,N-i,replace=T))
cat(row,"\n")
}
cat(c(sample(blank,(N-1),replace=T),sample(filler,1,replace=T),sample(blank,(N-1),replace=T)),"\n")
} #ctree
This being the result! My own happy little (or big, whatever floats your boat) tree.
Here is a more succinct version:
ctree <- function(N=10){
for (i in 1:N) cat(rep("",N-i+1),rep("*",i),"\n")
cat(rep("",N),"*\n")
}
ctree()
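A variant that builds the rows as strings first can be handy if you want to inspect or reuse the tree rather than print it straight away. This is only a sketch. The name tree_lines is made up, and it relies on strrep() and trimws() from base R.

```r
# Sketch: return the tree as a character vector, one element per row.
tree_lines <- function(N = 4) {
  i <- seq_len(N)
  # Row i: N - i leading spaces, then i stars separated by single spaces
  rows <- paste0(strrep(" ", N - i), strrep("* ", i))
  # Trunk: a single star under the tip of the tree
  trunk <- paste0(strrep(" ", N - 1), "*")
  c(trimws(rows, which = "right"), trunk)
}

writeLines(tree_lines(4))
```

Because strrep() is vectorised over its times argument, no explicit loop is needed.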
This code came from someone else. I wish I could credit them but I have lost the source. The tree it produces is beautiful, and perhaps you could modify it for your purposes.
part <- list(x0=0,y0=0,x1=0,y1=1,
branch1=NULL,branch2=NULL,extend=NULL,
lwd=1,depth=0,col='springgreen')
par(mfrow=c(1,1),mar=c(5, 4, 4, 2) + 0.1)
segplot <- function(tree) {
if (is.null(tree)) return()
segments(tree$x0,tree$y0,tree$x1,tree$y1,
col=tree$col,
lwd=tree$lwd)
segplot(tree$branch1)
segplot(tree$branch2)
segplot(tree$extend)
}
#segplot(part)
grow <- function(tree) {
if (is.null(tree) ) return(NULL)
tree$lwd=tree$lwd*1.2
if (tree$lwd>2.5) tree$col <- 'brown'
if (is.null(tree$extend)) {
tree$extend <- list(
x0=tree$x1,
y0=tree$y1,
x1=rnorm(1,1,.03)*(2*tree$x1-tree$x0),
y1=(rnorm(1,.98,.02)+.02*(tree$x1==tree$x0))*(2*tree$y1-tree$y0),
branch1=NULL,
branch2=NULL,
extend=NULL,
lwd=1,
depth=tree$depth,
col=tree$col
)
length=sqrt((tree$x1-tree$x0)^2 + (tree$y1-tree$y0)^2)
angle <- asin((tree$x1-tree$x0)/length)
branch <- list(
x0=(tree$x1+tree$x0)/2,
y0=(tree$y1+tree$y0)/2,
branch1=NULL,
branch2=NULL,
extend=NULL,
lwd=1,
depth=tree$depth,
col=tree$col
)
shift <- rnorm(2,.5,.1)
branch$x0 <- shift[1]*tree$x1+(1-shift[1])*tree$x0
branch$y0 <- shift[1]*tree$y1+(1-shift[1])*tree$y0
length=length*rnorm(1,.5,.05)
co <- runif(1,.35,.45)
branch$x1 <- branch$x0+sin(angle+co)*length
branch$y1 <- branch$y0+cos(angle+co)*length
tree$branch1 <- branch
branch$x0 <- shift[2]*tree$x1+(1-shift[2])*tree$x0
branch$y0 <- shift[2]*tree$y1+(1-shift[2])*tree$y0
co <- runif(1,.35,.45)
branch$x1 <- branch$x0+sin(angle-co)*length
branch$y1 <- branch$y0+cos(angle-co)*length
tree$branch2 <- branch
} else {
tree$branch1 <- grow(tree$branch1)
tree$branch2 <- grow(tree$branch2)
tree$extend <- grow(tree$extend)
}
tree$depth <- tree$depth+1
if (tree$depth>2) tree$col <- 'green'
if (tree$depth>4) tree$col <- 'darkgreen'
if (tree$depth>6) tree$col <- 'brown'
tree
}
tree <- part
for (i in 1:9) tree <- grow(tree)
par(mar=c(0,0,0,0))
plot(x=c(-3,3),y=c(0,9),type='n',axes=FALSE,xlab='',ylab='')
segplot(tree)
I'm trying to make a loop in which the variable names change with every iteration. The code is shown below. What I basically need is, for instance for a=2, for W_(a-1) to refer to a matrix called W_1, W_(a) to refer to a matrix called W_2, and finally GMM_(a-1)$par to refer to GMM_1$par.
for (a in 2:100){
GMM <- function(beta) {
for (i in 1:(nrow(gmm_i))){
gmm_i[i,] <- g_beta(i,beta)
}
gmm_N <- t(colSums(gmm_i)) %*% W_(a-1) %*% colSums(gmm_i)
W_(a) <<- solve((1/(nrow(A)/5))*t(gmm_i)%*%gmm_i)
return(gmm_N)
}
GMM_(a)<-optim(GMM_(a-1)$par,GMM)
}
I hope my question makes sense.
Thanks.
Do you want 100 variables (a1, a2, ...) in your workspace? It would be better to put all of these in a list, as that will be easier to use. But if for some reason you do want a bunch of variables, then use this:
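W_1 = 0 # placeholder starting weight matrix
for (a in 2:100){
GMM <- function(beta) {
for (i in 1:(nrow(gmm_i))){
gmm_i[i,] <- g_beta(i,beta)
}
# look up the previous weight matrix by its constructed name
W_prev <- get(paste("W_",a-1,sep=""),envir=.GlobalEnv)
gmm_N <- t(colSums(gmm_i)) %*% W_prev %*% colSums(gmm_i)
tmp <- solve((1/(nrow(A)/5))*t(gmm_i)%*%gmm_i)
assign((paste("W_",a,sep="")),tmp,envir=.GlobalEnv)
return(gmm_N)
}
# GMM_1 must already exist in the global environment
GMM_prev <- get(paste("GMM_",a-1,sep=""),envir=.GlobalEnv)
tmp1 <- optim(GMM_prev$par,GMM)
assign((paste("GMM_",a,sep="")),tmp1,envir=.GlobalEnv)
}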
Give W_1 an appropriate value.
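The list-based alternative mentioned above might look like the following sketch. The objective function, the weight update and the starting values are toy stand-ins, since g_beta, gmm_i and A are not shown in the question. Only the indexing pattern is the point.

```r
# Sketch: keep every iteration's weight and fit in lists instead of
# creating W_1, W_2, ... and GMM_1, GMM_2, ... as separate variables.

# Toy objective standing in for the real GMM criterion (hypothetical)
objective <- function(beta, W) sum(W * (beta - c(1, 2))^2)

n_iter <- 5
W <- vector("list", n_iter)
fits <- vector("list", n_iter)

W[[1]] <- 1                                         # starting weight
fits[[1]] <- optim(c(0, 0), objective, W = W[[1]])  # starting fit

for (a in 2:n_iter) {
  # Start from the previous estimate and use the previous weight
  fits[[a]] <- optim(fits[[a - 1]]$par, objective, W = W[[a - 1]])
  # Hypothetical weight update; replace with the real formula
  W[[a]] <- 1 / (1 + fits[[a]]$value)
}

fits[[n_iter]]$par
```

Here fits[[a - 1]]$par plays the role of GMM_(a-1)$par, and no assign() or get() calls are needed.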