How to run for loop R program faster? - r

I am using the following R code to compute the log-likelihood of the left and right segments for each split point i = 1, 2, ..., 200.
I want to run this procedure on a large number of generated datasets, for instance a = 10000, and repeat the entire loop 1000 times. How can I speed up the following program? Can I use an apply function instead of the for loops?
Thank you in advance!
n1 = 100
n2 = 100
a = 1000
n= n1 + n2
# number of simulated copies of y
sim.data = matrix(NA, nrow = n, ncol = a)
for (i in 1:a) {
#for(j in 1:a){
sim.data[,i] = c(rnorm(n1, 2, 1), rnorm(n-n1, 4, 1))
#}
}
dim(sim.data)
# Compute the log-likelihood
B = ncol(sim.data)
loglike_profb = matrix(NA, n - 1, B)
for (j in 1:B) {
for (i in 1:(n - 1)) {
loglike_profb[i, j] = -0.5*(sum(((sim.data[1:i,j]) - mean(sim.data[1:i,j]))^2) + sum(((sim.data[(i + 1):n,j]) - mean(sim.data[(i +1):n,j]))^2))
}
}

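You can put the calculation of loglike_profb into a function and then use mapply. The column index has to be repeated with each = so that the (i, j) pairs are visited in column-major order, and the result is then reshaped into a matrix. Note that mapply still calls the function once per pair, so this tidies the code more than it speeds it up.
loglike_profb_func <- function(i, j) {
-0.5*(sum(((sim.data[1:i,j]) - mean(sim.data[1:i,j]))^2) + sum(((sim.data[(i + 1):n,j]) - mean(sim.data[(i +1):n,j]))^2))
}
loglike_profb <- matrix(mapply(loglike_profb_func, rep(1:(n - 1), B), rep(1:B, each = n - 1)), nrow = n - 1)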
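For a real speed-up, one option is to avoid recomputing the sums for every split point. The sketch below is not from the original answer: it relies on the identity sum((x - mean(x))^2) = sum(x^2) - sum(x)^2 / m and on cumulative sums, so each column is processed in one pass. The helper name profile_loglik and the small value of a are assumptions chosen for illustration.

```r
set.seed(1)
n1 <- 100
n2 <- 100
n <- n1 + n2
a <- 50  # small number of simulated columns, for illustration only

# Simulate all columns at once instead of filling them in a loop
sim.data <- rbind(matrix(rnorm(n1 * a, 2, 1), n1, a),
                  matrix(rnorm(n2 * a, 4, 1), n2, a))

# Hypothetical helper: log-likelihood for every split point of one column
profile_loglik <- function(x) {
  n <- length(x)
  i <- seq_len(n - 1)                 # split points 1 .. n-1
  cs <- cumsum(x)[i]                  # running sum of the left segment
  cs2 <- cumsum(x^2)[i]               # running sum of squares of the left segment
  ss_left <- cs2 - cs^2 / i
  ss_right <- (sum(x^2) - cs2) - (sum(x) - cs)^2 / (n - i)
  -0.5 * (ss_left + ss_right)
}

# One call per column; the result is an (n - 1) x a matrix
loglike_fast <- apply(sim.data, 2, profile_loglik)
```

The sum-of-squares shortcut can lose some precision when the data have a very large mean relative to their spread, so it is worth checking against the original double loop on a small case first.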

Related

How to create a formula that depends on the value of another formula - R

H2 = 0
H3 - H999 = =SQRT(EXP(G3*0,1))*NORMINV(RAND();0;1)
I2 - I999 = =$E$2+$D$2*EXP($A$2*($G2-$B$2)+$C$2)/(1+EXP($A$2*($G2-$B$2)+$C$2))+H2
Hi guys,
I created the formulas above in Excel.
I want to implement them in R. Can anyone help me with this?
I have already tried to do it with formula and with data.frame.
The problem is that sigma depends on t and X(t-1).
Thanks,
Max
Edit for further question:
Simulation with sigma only depending on t in R
Simulation with sigma only depending on t in Excel
This seems to require a for loop. If the intention is to run this multiple times to get X(t), then the replications can be vectorized within the for loop:
fXt <- function(A, B, C, D, E, delta, steps, n) {
t <- seq(delta, by = delta, length.out = steps)
# pre-allocate Xt and sigma
Xt <- matrix(rep(E + D*exp(A*(t - B) + C)/(1 + exp(A*(t - B) + C)), each = n), n, steps)
sigma <- matrix(0, n, steps)
r <- matrix(exp(0.025)*rnorm(n*(steps - 1L)), n, steps - 1L) # sqrt(exp(0.05)) = exp(0.025)
for (i in 2:steps) {
sigma[,i] <- r[,i - 1L]*sqrt(abs(Xt[,i - 1L] - 50))
Xt[,i] <- Xt[,i] + sigma[,i]
}
t(Xt)
}
Call it like so:
Xt <- fXt(2, 5, 3, 30, 100, 0.05, 998, 100)
Xt will be a matrix with 998 rows and 100 columns. Each column is a different replication.
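For the simpler case in the question's edit, where sigma depends only on t, no loop is needed at all. The following is a rough sketch of a direct translation of the spreadsheet columns, not part of the original answer. The parameter values standing in for cells A2 to E2 and the time grid standing in for column G are assumptions.

```r
set.seed(42)

# Hypothetical values standing in for cells A2..E2
A <- 2
B <- 5
C <- 3
D <- 30
E <- 100

# Column G: assumed to be an evenly spaced time grid (rows 2..999)
t <- seq(0.05, by = 0.05, length.out = 998)

# Column H: H2 = 0, then sqrt(exp(0.1 * t)) * N(0, 1) for the remaining rows
H <- c(0, sqrt(exp(0.1 * t[-1])) * rnorm(length(t) - 1))

# Column I: logistic trend plus noise; plogis(x) equals exp(x) / (1 + exp(x))
I <- E + D * plogis(A * (t - B) + C) + H
```

Here every row's noise term is independent of the previous X value, which is why the whole column can be computed in one vectorized expression.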

Adjusting figure margins using split.screen

I am trying to produce multiple plots using the split.screen option and I need to have 7 plots on the page. One of them should be plotted on its own and the other 6 plotted repeatedly using a for loop.
This is my code for some simulation I am carrying out. It runs well, but I have two potential problems:
I am not sure which of the plots actually gets plotted because I couldn't get the assigned label to show up on the bigger plot.
The plot showing on screen 1 is not the actual data because I have plotted it separately and know what it should look like.
Simulating the data:
numpop = 2
N = 1250
nSNP = 5000
Fst = 0.001
omega = c(0.5, 0.5)
propnExtreme = 0.1
nsim = 10
Fst.obs = vector(length = nSNP)
pdiffs = vector(length = nSNP)
genomat = matrix(nrow = N, ncol = nSNP)
for (i in 1:nSNP){
p = runif(1, 0.1, 0.9)
alpha = p * (1 - Fst) / Fst
beta = (1 - p) * (1 - Fst) / Fst
ps = rbeta(numpop, shape1 = alpha, shape2 = beta)
vars = var(ps)
pdiffs[i] = max(ps) - min(ps)
Fst.obs[i] = vars / (p * (1 - p))
for (j in 1:numpop){
ind1 = (j-1) * N * omega[j] + 1
ind2 = j * N * omega[j]
freqs = c(ps[j]^2, 2 * ps[j] * (1 - ps[j]), (1 - ps[j])^2)
genomat[ind1:ind2, i] = sample(c(0, 1, 2), size = N*omega[j], replace = TRUE, prob = freqs)
}
}
snpmeans = colMeans(genomat)
pi = (1 + colSums(genomat)) / (2 + 2*nrow(genomat))
stdmat = scale(genomat, center=snpmeans, scale=sqrt(pi*(1-pi)))
pr = prcomp(stdmat, center=F, scale=F)
Plotting:
get( getOption("device" ) )()
png(file="myplot.png", width=2000, height = 1200)
par(oma = c(0,0,3,0))
split.screen(c(1,2)) # split display into two screens
plot(pr$x,
col = c(rep("red", N*omega[1]), rep("blue", N*omega[2])),
main = "Whole genotype data")
split.screen(c(2, 3), screen = 2) # now split the second into 2x3
for(i in 1:8) ## 8=#of screens
{
screen(i) # prepare screen i for output
fA=0.5
fa = 1-fA
combined_SNP <- sample(c(0:2), N, prob=c(fA^2, 2*fA*fa, fa^2), replace=T)
pheno_indep <-c()
##Phenotypes
for (i in 1:length(combined_SNP)){
if (combined_SNP[i] == '0') {
pheno_indep<- c(pheno_indep, rnorm(1, mean = 0.07, sd = 1))
} else if (combined_SNP[i ]== '1') {
pheno_indep <- c(pheno_indep, rnorm(1, mean = 0, sd = 1))
} else {
pheno_indep <- c(pheno_indep, rnorm(1, mean = -0.07, sd = 1))
}
}
l <- 1:N
combined_indep <- cbind(combined_SNP, pheno_indep, l)
sorted_combined <- combined_indep[order(combined_indep[, 2]), ]
##eps data
f = 0.1
Nums = nrow(sorted_combined)
keep <- c(1:(f*Nums), (Nums-(f*Nums)+1):Nums)
epsdat<- c(rep("0", f*Nums), rep("1", f*Nums))
EPS_dat <- as.factor(cbind(sorted_combined[keep, ], epsdat))
dim(EPS_dat) <- c(length(keep), 4)
#colnames(EPS_dat) <- c("Genotypes", "Phenotypes", "ID", "position")
PC_EPS <- prcomp((genomat[EPS_dat[, 3], ]))
plot(PC_EPS$x,
col=c(rep("red", f*Nums), rep("blue", f*Nums)))
}
close.screen(all=TRUE)
dev.off()
Result:
I have spent a lot of time trying to figure this out, even with other functions like layout.show. Thanks!
Is the following what you expect to be plotted? (I added screen title to the small plots for illustration)
When you split the screens, you should have gotten the following on your console:
> split.screen(c(1, 2))
[1] 1 2
# (code used to plot first chart on the left)
> split.screen(c(2, 3), screen = 2)
[1] 3 4 5 6 7 8
As described in the help file ?split.screen, this is a vector of screen numbers for the newly-created screens. So your valid screen numbers are 1 (already plotted), and 3-8 (6 small screens).
As such, the next line doesn't work as expected, since you're now looping through screens 1-8 rather than screens 3-8.
# instead of
for(i in 1:8) ## 8=#of screens
# use this
for(i in 3:8) ## 8=#of screens
As a side note, you should also use different loop counters for nested loops. Your outer loop (for the 6 small plots) used i as the loop counter. Within this loop, you have another loop for phenotypes, which used i as well. Since the screen selection was done at the start of each outer loop iteration, the code still worked in this case, but in general, best to keep the loop counters separate.
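Both points can be seen in a stripped-down sketch. It uses random placeholder data rather than the genotype simulation, writes to a temporary PDF file, and the variable names top, small, s and k are illustrative choices. The par(mar = ...) values are an assumption picked so the small panels fit.

```r
set.seed(1)
out <- tempfile(fileext = ".pdf")
pdf(out, width = 12, height = 6)

top <- split.screen(c(1, 2))                       # screens 1 and 2
screen(top[1])
plot(rnorm(50), rnorm(50), main = "Whole genotype data")

small <- split.screen(c(2, 3), screen = top[2])    # screens 3 to 8
for (s in small) {                                 # loop over the returned screen numbers
  screen(s)
  par(mar = c(3, 3, 2, 1))                         # smaller margins for the small panels
  y <- numeric(20)
  for (k in seq_along(y)) {                        # inner loop uses its own counter
    y[k] <- rnorm(1)
  }
  plot(y, main = paste("Screen", s))
}

close.screen(all.screens = TRUE)
dev.off()
```

Looping over the vector returned by split.screen avoids hard-coding 3:8, so the code keeps working if the layout changes.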

How do I ask R to compile a matrix column by column?

I am trying to generate a matrix sz by first applying a binomial draw, then adding values from the corresponding column of pombe_new_subs, and using this combined value as the size for the following column.
After many frustrations, the following code is what I've ended up with, and it just doesn't work. The error I'm running into is:
# Error in sz[j, i + 1] = sz[, i] + pombe_new_subs[, i] :
# number of items to replace is not a multiple of replacement length
pombe_new_subs <- rmultinom(3, 15, prob = c(0.3, 0.3, 0.3))
randomdiv <- function(nchrom, ndivs, size) {
sz <- matrix(nrow = nchrom, ncol = ndivs)
for (j in 1:nchrom) {
n <- size
for (i in 1:ndivs) {
n <- rbinom(1, n, 0.5)
sz[j,i] <- n
}
sz[j,i+1] = sz[ ,i] + pombe_new_subs[ ,i]
sz[j, i+1] <- n
}
return (sz)
}
randomdiv(3, 3, 10)
I know this is probably a fairly simple looping exercise but frustration has entirely taken over.
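The error arises because sz[j, i+1] is a single cell while sz[, i] + pombe_new_subs[, i] is a whole column, and after the inner loop i + 1 also runs past the last column. One possible reading of the description is sketched below, under the assumption that every chromosome starts from the same size and that column i of pombe_new_subs is added after division i. The new_subs argument is introduced here for illustration so the function does not depend on a global variable.

```r
set.seed(1)
pombe_new_subs <- rmultinom(3, 15, prob = c(0.3, 0.3, 0.3))  # 3 x 3 matrix

randomdiv <- function(nchrom, ndivs, size, new_subs) {
  sz <- matrix(NA_integer_, nrow = nchrom, ncol = ndivs)
  n <- rep(size, nchrom)                 # current size for each chromosome
  for (i in seq_len(ndivs)) {
    sz[, i] <- rbinom(nchrom, n, 0.5)    # fill one whole column at a time
    n <- sz[, i] + new_subs[, i]         # input size for the next column
  }
  sz
}

sz <- randomdiv(3, 3, 10, pombe_new_subs)
```

Working column by column removes the inner loop over rows, because rbinom accepts a vector of sizes.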

How to regenerate fresh matrix when replicating

I have the following script:
randomdiv <- function(ncells, ndivs, size, accuracy) {
sz <- matrix(nrow = ncells, ncol = ndivs)
for (j in 1:ncells) {
total_subunits <- size
for (i in 1:ndivs)
{
accurate_subunits <- (size * accuracy)
random_subunits <- round(size - accurate_subunits)
random_inh <- rbinom(1, random_subunits, 0.5)
accurate_inh <- (accurate_subunits / 2)
total_inh <- 2 * (random_inh + accurate_inh)
sz[j,i] <- total_inh
total_subunits <- total_inh
}
}
return (do.call(rbind, replicate(100, sz, simplify = FALSE)))
}
I expected randomdiv(5, 20, 10, 0) to return a matrix with 500 rows, made up of 100 separately generated sz matrices. It does return 500 rows, but the replicates are identical rather than each one being a fresh generation of data, which is what I need.
Any ideas how I can make sure that each replicate is a new matrix, not literally a copy of the first one generated?
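replicate re-evaluates its expression on every repetition, but in the script that expression is just sz, a matrix that has already been computed, so the same values come back 100 times. A minimal sketch of one way around this is to move the generation into its own function and replicate the call. The names one_matrix and nrep are illustrative, and the loop body is copied unchanged from the question.

```r
set.seed(1)

# Hypothetical helper: generates a single fresh ncells x ndivs matrix
one_matrix <- function(ncells, ndivs, size, accuracy) {
  sz <- matrix(nrow = ncells, ncol = ndivs)
  for (j in 1:ncells) {
    total_subunits <- size
    for (i in 1:ndivs) {
      accurate_subunits <- size * accuracy
      random_subunits <- round(size - accurate_subunits)
      random_inh <- rbinom(1, random_subunits, 0.5)
      accurate_inh <- accurate_subunits / 2
      total_inh <- 2 * (random_inh + accurate_inh)
      sz[j, i] <- total_inh
      total_subunits <- total_inh
    }
  }
  sz
}

randomdiv <- function(ncells, ndivs, size, accuracy, nrep = 100) {
  # The call to one_matrix() is re-evaluated nrep times, so each block is new
  do.call(rbind, replicate(nrep, one_matrix(ncells, ndivs, size, accuracy),
                           simplify = FALSE))
}

res <- randomdiv(5, 20, 10, 0)
```

With these arguments res has 500 rows and 20 columns, and consecutive blocks of 5 rows are independent draws.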

replacement and two for loop in R

I would like to have a vector res of length 200 that holds 20 repetitions of 10 randomly generated values, each divided by 2 (this is r[i]). How can I get this in R? I wrote the following code, but it only saves the values from a single iteration, not from all iterations.
r = rep(0, 10)
res = matrix(0, nrow=200, ncol=1)
for(j in 1:20){
for(i in 1:10){
x = rnorm(10, 0, 1)
r[i] = x/2
}
res = rbind(r)
}
As Roland said in a comment on your question, writing two loops for this isn't good practice. However, you can do it like this:
res = rep(0, 200)
r = rep(0, 10)
for(j in 1:20){
for(i in 1:10){
x = rnorm(1, 0, 1)
r[i] = x/2
}
res[((j-1)*10+1):(j*10)] = r
}
As for your solution, there were some problems:
There is no need to define a matrix res = matrix(0, nrow=200, ncol=1) if you only need a vector.
rnorm(10,0,1) returns a vector of 10 values, so assigning it to r[i] (which holds only one value) isn't correct.
rbind joins two or more vectors or matrices by rows, so calling it with a single argument doesn't make sense here, and it overwrites res on every iteration.
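Since rnorm is vectorized, the loops can be dropped entirely. The short sketch below shows the loop-free version next to the two-loop version; the names res_vec and res_loop are illustrative.

```r
# Loop-free version: 200 standard normal draws, each divided by 2
set.seed(123)
res_vec <- rnorm(200, 0, 1) / 2

# Two-loop version from the answer above, with the same seed for comparison
set.seed(123)
res_loop <- rep(0, 200)
r <- rep(0, 10)
for (j in 1:20) {
  for (i in 1:10) {
    r[i] <- rnorm(1, 0, 1) / 2
  }
  res_loop[((j - 1) * 10 + 1):(j * 10)] <- r
}
```

Both versions consume the random number stream in the same order, so with the same seed they should produce the same 200 values.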
