How do I do a loop using R

I have to create a loop, but I don't know how to tell R what I want it to do.
for(i in 1:nrow(File1))
for(j in 1:ncol(File2)){
if [(x(i,1)==(cd(1,j)))] # Until here I think it is ok
THEN # I don't know what is the command for THEN in R
for (k in File3) #I have to take all the data appearing in File3
Output (k,1)= K # I don't know what is the command to order the output in R
Output (k,2)= cd(1,j)
Output (k,3)= x(i,2)
Output (k,4)= x(i,3)
Output (k,5)= x(i,4)
Output (k,6)= cd(1,j)
How do I have to finish the loop?
Thanks in advance, I'm a bit confused.

So this is a basic for-loop, which just prints out the values.
data <- cbind(1:10);
for (i in 1:nrow(data)) {
print(i)
}
If you want to save the output you have to initialize a vector / list /matrix, etc.:
output <- vector()
for (i in 1:nrow(data)) {
  output[i] <- i  # store each value in the vector initialized above
}
output
And a little example for nested loops:
data <- cbind(1:5);
data1 <- cbind(15:20)
data2 <- cbind(25:30)
for (i in 1:nrow(data)) {
print(paste("FOR 1: ", i))
for (j in 1:nrow(data1)) {
print(paste("FOR 2: ", j))
for (k in 1:nrow(data2)) {
cat(paste("FOR 3: ", k, "\n"))
}
}
}
But as already mentioned, you would probably be better off using an "apply" function (apply, sapply, lapply, etc.). Check out the post on the apply family.
Or use the package dplyr with the pipe (%>%) operator.
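As a rough illustration of the apply-style alternative mentioned above, the sketch below computes the same result three ways: with a for loop, with sapply, and with a plain vectorized expression. The squaring task and the names squares_loop, squares_sapply and squares_vec are made up for the example.

```r
data <- cbind(1:10)

# for loop with a preallocated result vector
squares_loop <- numeric(nrow(data))
for (i in seq_len(nrow(data))) {
  squares_loop[i] <- data[i, 1]^2
}

# sapply applies the function to every element and collects the results
squares_sapply <- sapply(data[, 1], function(v) v^2)

# arithmetic in R is already vectorized, so no loop is needed here at all
squares_vec <- data[, 1]^2
```

For simple element-wise arithmetic like this, the vectorized form is usually the shortest; sapply and friends are more useful when the per-element work is a function call that is not itself vectorized.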
To include some if/else syntax in the loop:
data <- cbind(1:5);
data1 <- cbind(15:20)
data2 <- cbind(25:30)
for (i in 1:nrow(data)) {
if (i == 1) {
print("i is 1")
} else {
print("i is not 1")
}
for (j in 1:nrow(data1)) {
print(paste("FOR 2: ", j))
for (k in 1:nrow(data2)) {
cat(paste("FOR 3: ", k, "\n"))
}
}
}
In the first loop, I am asking if i is 1.
If yes, the first print statement is used ("i is 1"), otherwise the second is used ("i is not 1").
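Putting the pieces together, here is a minimal sketch of how the pseudocode in the question might translate to R. It assumes x is a data frame whose first column holds the key and cd is a one-row matrix of codes; the sample values, the column names code, v2, v3, v4 and the rows list are invented for illustration. What File3 should contribute is not clear from the question, so it is left out.

```r
# Made-up stand-ins for the question's x and cd
x <- data.frame(id = c(1, 2, 3),
                a  = c(10, 20, 30),
                b  = c(11, 21, 31),
                c  = c(12, 22, 32))
cd <- matrix(c(2, 3, 5), nrow = 1)

rows <- list()  # collects one small data frame per match
for (i in seq_len(nrow(x))) {
  for (j in seq_len(ncol(cd))) {
    # R has no THEN keyword: the braces after if (...) play that role
    if (x[i, 1] == cd[1, j]) {
      rows[[length(rows) + 1]] <- data.frame(code = cd[1, j],
                                             v2 = x[i, 2],
                                             v3 = x[i, 3],
                                             v4 = x[i, 4])
    }
  }
}
output <- do.call(rbind, rows)  # bind all matches into one data frame
print(output)
```

Indexing uses square brackets (x[i, 1]), not parentheses, and every loop and if block is closed with a matching brace.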

R also has while and repeat loops.
A while loop executes a set of statements repeatedly for as long as its condition is satisfied; a repeat loop runs until it reaches a break.
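A small sketch of both loop forms, with made-up variable names (total, i, count), in case it helps to see them side by side:

```r
# while: the condition is checked before every iteration
i <- 1
total <- 0
while (i <= 5) {
  total <- total + i
  i <- i + 1
}

# repeat: no condition of its own, so it needs an explicit break
count <- 0
repeat {
  count <- count + 1
  if (count >= 3) break
}
```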

Related

Make a for loop run faster in R

I want to create a model where I duplicate a sentence several times, introducing random error each time. The duplicates of the sentence also get duplicated. So, in cycle one, I start with just "example_sentence". In cycle two, I have two copies of that sentence. In cycle three, I have 4 copies of that sentence. I want to do this for 25 cycles with 20k sentences. The code I wrote to do that works way too slowly, and I am wondering if there is a way to make my nested for loops more efficient? Here is the part of the code that is the slowest:
alphabet <- c("a","b","d","j")
modr1 <- "sentencetoduplicate"
errorRate <- c()
errorRate <- append(errorRate, rep(1,1))
errorRate <- append(errorRate, rep(0,999))
duplicate <- c(modr1)
for (q in 1:25) {
collect <- c()
for (z in seq_along(duplicate)) {
modr1 = duplicate[z]
compile1 <- c()
for (k in 1:nchar(modr1)) {
error <- sample(errorRate, 1, replace = TRUE)
if (error == 1) {
compile1 <- append(compile1, sub(substring(modr1,k,k),sample(alphabet,1,replace=TRUE),substring(modr1,k,k)))
} else {
compile1 <- append(compile1, substring(modr1,k,k))
}
}
modr1 <- paste(compile1, collapse='')
collect <- append(collect, modr1)
}
duplicate <- append(duplicate, collect)
}
Here is a faster approach to your loop, but I think the problem of scaling it to 20K sentences remains!
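library(data.table)  # provides data.table() and the := operator used below

# Replace each letter with a random one from alphabet with probability error_rate
f <- function(let, alphabet = c("a","b","c","d","j"), error_rate = 1/1000) {
  lenlet <- length(let)
  let <- unlist(let)
  k <- rbinom(length(let), 1, prob = error_rate)
  let[k == 1] <- sample(alphabet, size = sum(k == 1), replace = TRUE)
  return(as.list(as.data.frame(matrix(let, ncol = lenlet))))
}
modr1 <- "sentencetoduplicate"
k <- data.table(list(strsplit(modr1, "")[[1]]))
for (q in 1:25) {
  k[, V1 := list(f(V1))]
  k <- k[rep(1:nrow(k), 2)]  # double the table on every cycle
}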
Updated with slightly faster version! (Notice this is no longer by=1:nrow(k))
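For comparison, here is a base R sketch of the same idea that is not taken from the answer above. It assumes all sentences have equal length, stores them as rows of a character matrix, and draws all random errors for a cycle in one call. The helper names mutate_chars and duplicate_with_errors are hypothetical, and only 5 cycles are run because the number of rows doubles on every cycle.

```r
# Replace each character with a random letter with probability error_rate
mutate_chars <- function(chars, alphabet, error_rate) {
  hit <- runif(length(chars)) < error_rate
  chars[hit] <- sample(alphabet, sum(hit), replace = TRUE)
  chars
}

# Each cycle keeps the existing sentences and appends one mutated copy of each
duplicate_with_errors <- function(sentence, cycles, alphabet, error_rate) {
  pool <- matrix(strsplit(sentence, "")[[1]], nrow = 1)
  for (q in seq_len(cycles)) {
    copies <- pool
    copies[] <- mutate_chars(as.vector(copies), alphabet, error_rate)
    pool <- rbind(pool, copies)
  }
  apply(pool, 1, paste, collapse = "")  # collapse rows back into strings
}

set.seed(1)
result <- duplicate_with_errors("sentencetoduplicate", cycles = 5,
                                alphabet = c("a", "b", "d", "j"),
                                error_rate = 1 / 1000)
length(result)
```

The memory problem does not go away: 25 cycles produce 2^25 copies per starting sentence, so for 20k sentences the total size, rather than the loop overhead, is likely to be the real limit.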

Break out of a multi-nested for-loop to the outer most rasters

I have a multi-nested for-loop that I need to restart the entire loop once the last nest (here, clip.groups) is complete. I have tried several options. Each layer involves rasters and I cannot vectorize it through apply, etc. Since there are so many input files, it is not reproducible.
The basic structure though is this:
clip.groups <- c('Bay area','Alameda County','Oakland','West and Downtown Oakland')
rate.groups <- c('co.25','cbg.25')
conc.groups <- c('ppb', 'ug')
pop.groups <- c('pop.ls.night.25')
beta.groups <- c(0.001105454,0.000318195,0.001881231)
for (j in 1:length(conc.groups)){
  for (i in 1:length(beta.groups)){
    for (k in 1:length(rate.groups)){
      for (h in 1:length(pop.groups)){
        for (m in 1:length(clip.groups)){
          break #==== THIS IS WHERE I NEED IT TO GO BACK TO THE OUTER MOST LOOP - (conc.groups j)
        }
      }
    }
  }
}
If you go back to the outermost loop, then the in-between loops are meaningless: only their first elements are ever used. That is, you get this:
clip.groups <- c('Bay area','Alameda County','Oakland','West and Downtown Oakland')
rate.groups <- c('co.25','cbg.25')
conc.groups <- c('ppb', 'ug')
pop.groups <- c('pop.ls.night.25')
beta.groups <- c(0.001105454,0.000318195,0.001881231)
for (j in 1:length(conc.groups)){
beta.groups[1]
rate.groups[1]
pop.groups[1]
for (m in 1:length(clip.groups)){
cat(j, "-", m, "\n")
}
}
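If the intermediate loops need to stay in the code, note that break in R only leaves the innermost loop. One common workaround, sketched here with a hypothetical flag named done and a shortened set of groups, is to set a flag in the innermost loop and test it at each enclosing level:

```r
conc.groups <- c("ppb", "ug")
beta.groups <- c(0.001105454, 0.000318195, 0.001881231)
clip.groups <- c("Bay area", "Alameda County")

visited <- character()  # records which (j, i, m) combinations actually ran
for (j in seq_along(conc.groups)) {
  done <- FALSE  # hypothetical flag, reset for every outer iteration
  for (i in seq_along(beta.groups)) {
    for (m in seq_along(clip.groups)) {
      visited <- c(visited, paste(j, i, m, sep = "-"))
      if (m == length(clip.groups)) {
        done <- TRUE  # last clip group finished
        break         # leaves only the m loop
      }
    }
    if (done) break   # leaves the i loop, so j moves on to its next value
  }
}
visited
```

With more nesting levels, the same `if (done) break` line is repeated once per intermediate loop.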

R - Saving the values from a For loop in a vector or list

I'm trying to save each iteration of this for loop in a vector.
for (i in 1:177) {
a <- geomean(er1$CW[1:i])
}
Basically, I have a list of 177 values and I'd like the script to find the cumulative geometric mean of the list going one by one. Right now it will only give me the final value, it won't save each loop iteration as a separate value in a list or vector.
The reason your code does not work is that the object a is overwritten in each iteration. The following code, for instance, does precisely what you want:
a <- c()
for(i in 1:177){
a[i] <- geomean(er1$CW[1:i])
}
Alternatively, this would work as well:
for(i in 1:177){
if(i != 1){
a <- rbind(a, geomean(er1$CW[1:i]))
}
if(i == 1){
a <- geomean(er1$CW[1:i])
}
}
I started down a similar path with rbind as @nate_edwinton did, but couldn't figure it out. I did, however, come up with something effective: fill a data frame row by row, then coerce it back to a list. Hmmmm, geomean. Cool.
MyNums <- data.frame(x=(1:177))
a <- data.frame(x=integer())
for(i in 1:177){
a[i,1] <- geomean(MyNums$x[1:i])
}
a<-as.list(a)
You can try defining the variable that stores the results first:
b <- c()
for (i in 1:177) {
a <- geomean(er1$CW[1:i])
b <- c(b,a)
}
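The answers above all grow the result inside the loop. A sketch of two alternatives follows: preallocating the result vector, and computing the cumulative geometric mean without a loop. Since geomean is not a base R function, it is defined here as exp(mean(log(x))), which is an assumption about what the asker's function does; values is a made-up stand-in for er1$CW.

```r
# Assumed definition; valid for strictly positive numbers only
geomean <- function(x) exp(mean(log(x)))

values <- c(2, 8, 4, 1)  # stand-in for er1$CW

# Loop with a preallocated vector, which avoids regrowing a on every iteration
a <- numeric(length(values))
for (i in seq_along(values)) {
  a[i] <- geomean(values[1:i])
}

# Loop-free version: running sum of logs divided by the running count
b <- exp(cumsum(log(values)) / seq_along(values))
```

Both a and b hold one cumulative geometric mean per position; the second form does a single pass over the data instead of recomputing the mean from scratch for every prefix.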

How to vectorize from if to ifelse with multiple statements?

I just read that vectorization increases performance and significantly lowers computation time, and that in the case of if () else, the best choice is ifelse().
My problem is that I have some if statements inside a for loop, and each if statement contains multiple assignments, like the following:
x <- matrix(1:10,10,3)
criteria <- matrix(c(1,1,1,0,1,0,0,1,0,0,
1,1,1,1,1,0,0,1,1,0,
1,1,1,1,1,1,1,1,1,1),10,3) #criteria for the ifs
output1 <- rep(list(NA),10) #storage list for output
for (i in 1:10) {
if (criteria[i,1]>=1) {
output1[[i]] <- colMeans(x)
output1[[i]] <- output1[[i]][1] #part of the somefunction output
} else {
if(criteria[i,2]>=1) {
output1[[i]] <- colSums(x)
output1[[i]] <- output1[[i]][1] #the same
} else {
output1[[i]] <- colSums(x+1)
output1[[i]] <- output1[[i]][1] #the same
}}}
How can I translate this into ifelse?
Thanks in advance!
Note that you don't need a for loop as all operations used are vectorized:
output2 <- ifelse(criteria[, 1] >= 1,
colMeans(x)[1],
ifelse(criteria[, 2] >= 1,
colSums(x)[1],
colSums(x+1)[1]))
identical(output1, as.list(output2))
## [1] TRUE
At least you can convert two assignments into one. So instead of
output[[i]] <- somefunction(arg1,arg2,...)
output[[i]] <- output[[i]]$thing #part of the somefunction output
you can refer directly to the only part you are interested in.
output[[i]] <- somefunction(arg1,arg2,...)$thing #part of the somefunction output
Hope that it helps!
It seems I found the answer trying to build the example:
output2 <- rep(list(NA),10) #storage list for output
for (i in 1:10) {
output2[[i]] <- ifelse(criteria[i,1]>=1,
yes=colMeans(x)[1],
no=ifelse(criteria[i,2]>=1,
yes=colSums(x)[1],
no=colSums(x+1)[1]))}

Making a nested for loop run faster in R

I have the following code (a nested for loop) in R which is extremely slow. The loop matches values from two columns, then picks up the corresponding file and iterates through it to find a match, then takes that row from the file. The number of iterations can exceed 100,000. I would appreciate any insight on how to speed up the process.
for(i in 1: length(Jaspar_ids_in_Network)) {
m <- Jaspar_ids_in_Network[i]
gene_ids <- as.character(GeneTFS$GeneIds[i])
gene_names <- as.character(GeneTFS$Genes[i])
print("i")
print(i)
for(j in 1: length(Jaspar_ids_in_Exp)) {
l <- Jaspar_ids_in_Exp[j]
print("j")
print(j)
if (m == l) {
check <- as.matrix(read.csv(file=paste0(dirpath,listoffiles[j]),sep=",",header=FALSE))
data_check <- data.frame(check)
for(k in 1: nrow(data_check)) {
gene_ids_JF <- as.character(data_check[k,3])
genenames_JF <- as.character(data_check[k,4])
if(gene_ids_JF == gene_ids) {
GeneTFS$Source[i] <- as.character(data_check[k,3])
data1 <- rbind(data1, cbind(as.character(data_check[k,3]),
as.character(data_check[k,8]),
as.character(data_check[k,9]),
as.character(data_check[k,6]),
as.character(data_check[k,7]),
as.character(data_check[k,5])))
} else if (toupper(genenames_JF) == toupper(gene_names)) {
GeneTFS$Source[i] <- as.character(data_check[k,4])
data1 <- rbind(data1, cbind(as.character(data_check[k,4]),
as.character(data_check[k,5]),
as.character(data_check[k,6]),
as.character(data_check[k,7]),
as.character(data_check[k,8]),
as.character(data_check[k,2])))
} else {
# GeneTFS[i,4] <- "No Evidence"
}
}
} else {
# GeneTFS[i,4] <- "Record Not Found"
}
}
}
If you pull out the logic for processing one pair into a function, f(m,l), then you could replace the double loop with:
outer(Jaspar_ids_in_Network, Jaspar_ids_in_Exp, Vectorize(f))
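A minimal sketch of that idea, using made-up ID vectors in place of Jaspar_ids_in_Network and Jaspar_ids_in_Exp and a hypothetical f that only tests equality. In the real code, f would also contain the file handling; the point here is just to find the matching pairs up front so that files are read only for those.

```r
network_ids <- c("MA0001", "MA0002", "MA0003")  # stand-in for Jaspar_ids_in_Network
exp_ids     <- c("MA0003", "MA0001", "MA0009")  # stand-in for Jaspar_ids_in_Exp

# Hypothetical per-pair function; Vectorize lets outer call it element-wise
f <- function(m, l) m == l
hits <- outer(network_ids, exp_ids, Vectorize(f))

# Row/column indices of the pairs that matched
pairs <- which(hits, arr.ind = TRUE)

# If each ID appears at most once in exp_ids, match() gives the same lookup directly
first_match <- match(network_ids, exp_ids)
```

Each row of pairs gives an (i, j) combination worth processing, so the expensive read.csv call no longer sits inside a loop over every possible pair.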
