Printing objects from a double for loop in R

I have code for nested for loops here. The output I would like to receive is a matrix of the means of the columns of the matrix produced by the nested loop. So, the interior loop should run 1000 simulations of a randomized vector, and run a function each time. This works fine on its own, and spits the output into R. But I want to save the output from the nested loop to an object (a matrix of 1000 rows and 11 columns), and then print only the colMeans of that matrix, to be performed by the outer loop.
I think the problem lies in the step where I assign the results of the inner loop to the obj matrix. I have tried every variation on obj[i,],obj[i],obj[[i]], etc. with no success. R tells me that it is an object of only one dimension.
x=ACexp
obj=matrix(nrow=1000,ncol=11,byrow=T) #create an empty matrix to dump results into
for(i in 1:ncol(x)){ #nested for loops
a=rep(1,times=i) #repeat 1 for 1:# columns in x
b=rep(0,times=(ncol(x)-length(a))) #have the rest of the vector be 0
Inv=append(a,b) #append these two for the Inv vector
for (i in 1:1000){ #run this vector through the simulations
Inv2=sample(Inv,replace=FALSE) #randomize interactions
temp2=rbind(x,Inv2)
obj[i]<-property(temp2) #print results to obj matrix
}
print.table(colMeans(obj)) #get colMeans and print to excel file
}
Any ideas how this can be fixed?

You're repeatedly printing the column means to the screen as the matrix gets modified, but your comment says "print to excel file". I'm guessing you actually want to save your data to a file. Remove the print.table command altogether and, after your loops have completed, use write.table():
write.table(colMeans(obj), 'myNewMatrixFile.csv', quote = FALSE, sep = ',', row.names = FALSE)
(my preferred options... see ?write.table to select the ones you like)

Since your code isn't reproducible, we can't quite tell what you want. However, I guess that property is returning a single number that you want to place in the right row/column place of the obj matrix, which you would refer to as obj[row,col]. But you'll have trouble with that as is, because both your loops are using the same index i. Maybe something like this will work for you.
obj <- matrix(nrow=1000,ncol=11,byrow=T) #create an empty matrix to dump results into
for(i in 1:ncol(x)){ #nested for loops
Inv <- rep(c(1,0), times=c(i, ncol(x)-i)) #repeat 1 for 1:# columns in x, then 0's
for (j in 1:nrow(obj)){ #run this vector through the simulations
Inv2 <- sample(Inv,replace=FALSE) #randomize interactions
temp2 <- rbind(x,Inv2)
obj[j,i] <- property(temp2) #save results in obj matrix
}
}
write.csv(colMeans(obj), 'myFile.csv') #get colMeans and print to csv file
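The indexing fix above can be tried end to end with a small sketch. Because ACexp and property() were not posted, x and property() below are hypothetical stand-ins (a random 0/1 matrix and a function that returns a single number), and the number of simulations is reduced so the example runs quickly.

```r
set.seed(42)

# Hypothetical stand-ins: the question's ACexp and property() were not posted
x <- matrix(rbinom(20, 1, 0.5), nrow = 5, ncol = 4)
property <- function(mat) sum(mat) / length(mat)  # returns a single number

n_sims <- 50
# One row per simulation, one column per value of the outer index
obj <- matrix(NA_real_, nrow = n_sims, ncol = ncol(x))

for (i in seq_len(ncol(x))) {        # outer loop: how many 1s go into the vector
  inv <- rep(c(1, 0), times = c(i, ncol(x) - i))
  for (j in seq_len(n_sims)) {       # inner loop uses its own index j
    inv2 <- sample(inv)              # shuffle the vector
    obj[j, i] <- property(rbind(x, inv2))
  }
}

col_means <- colMeans(obj)           # one mean per value of i
print(col_means)
```

Using distinct indices (i for columns, j for rows) is what lets obj[j, i] address a single cell; colMeans() is then called once, after both loops have finished.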

Related

How to create a matrix or list of results using a loop?

I am performing a loop to compute the values of 4 expressions. My loop is:
for (i in c(1:14)){
VV1a <- round((db$Ya1[i]^Comb$Sigma)+ (1/(exp(log(1/p1a)^Comb$Alpha)))*
((db$Xa1[i]^Comb$Sigma)-(db$Ya1[i]^Comb$Sigma)),1)
VV1b <- round((db$Yb1[i]^Comb$Sigma)+ (1/(exp(log(1/p1b)^Comb$Alpha)))*
((db$Xb1[i]^Comb$Sigma)-(db$Yb1[i]^Comb$Sigma)),1)
VV2a <- round((db$Ya2[i]^Comb$Sigma)+ (1/(exp(log(1/p2a)^Comb$Alpha)))*
((db$Xa2[i]^Comb$Sigma)-(db$Ya2[i]^Comb$Sigma)),1)
VV2b <- round((db$Yb2[i]^Comb$Sigma)+ (1/(exp(log(1/p2b)^Comb$Alpha)))*
((db$Xb2[i]^Comb$Sigma)-(db$Yb2[i]^Comb$Sigma)),1)
}
For each expression, I get 2105401 values. However, with this loop R overwrites the elements on each iteration (of course). In the end, my elements (VV1a, ....) contain only the result of the last iteration (i.e. i = 14).
How do I keep all the computation? To be more specific: ideally, for each, I would like to have a vector of the values computed.
Use a list().
Assuming that you're doing different calculations for VV1a, VV1b, etc..., you could store, for every iteration i, the resulting array as a list.
results <- list()
for (i in c(1:14)){
results[["VV1a"]][[i]] <- list(your_calculations_which_result_in_a_vector)
....
}
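As a minimal sketch of the list approach, the example below stores one numeric vector per iteration and then binds them into a matrix. The inputs ya, xa, sigma and w are hypothetical stand-ins for db$Ya1, db$Xa1, Comb$Sigma and the probability weighting term, which were not posted.

```r
# Hypothetical inputs standing in for the question's db and Comb objects
ya <- c(1, 2, 3)
xa <- c(4, 5, 6)
sigma <- c(0.5, 1, 2)   # plays the role of Comb$Sigma (one value per combination)
w <- 0.5                # plays the role of the weighting term

# Pre-allocate one list slot per iteration
results <- list(VV1a = vector("list", length(ya)))

for (i in seq_along(ya)) {
  # Each iteration yields a whole vector (one value per element of sigma)
  results[["VV1a"]][[i]] <- round(ya[i]^sigma + w * (xa[i]^sigma - ya[i]^sigma), 1)
}

# Optional: stack the per-iteration vectors into a matrix (rows = iterations)
vv1a <- do.call(rbind, results[["VV1a"]])
print(vv1a)
```

Keeping each iteration in its own list slot means nothing is overwritten, and results[["VV1a"]][[i]] retrieves the vector computed for iteration i.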

Storing matrix after every iteration

I have following code.
for(i in 1:100)
{
for(j in 1:100)
R[i,j]=gcm(i,j)
}
gcm() is some function which returns a number based on the values of i and j and so, R has all values. But this calculation takes a lot of time. My machine's power was interrupted several times due to which I had to start over. Can somebody please help, how can I save R somewhere after every iteration, so as to be safe? Any help is highly appreciated.
You can use the saveRDS() function to save the result of each calculation in a file.
To understand the difference between save and saveRDS, here is a link I found useful. http://www.fromthebottomoftheheap.net/2012/04/01/saving-and-loading-r-objects/
If you want to save the R-workspace have a look at ?save or ?save.image (use the first to save a subset of your objects, the second one to save your workspace in toto).
Your edited code should look like
for(i in 1:100)
{
for(j in 1:100)
R[i,j]=gcm(i,j)
save.image(file="path/to/your/file.RData")
}
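A lighter-weight variant, sketched here under assumptions, saves only the matrix with saveRDS() and skips rows that were already completed when the script is restarted. The gcm() body, the matrix size and the checkpoint path are hypothetical; in practice the path would be a fixed file that survives a reboot rather than a temporary one.

```r
# Hypothetical stand-in for the slow gcm() from the question
gcm <- function(i, j) i * j

n <- 5
checkpoint <- tempfile(fileext = ".rds")  # assumption: use a permanent path in real use

# Resume from the checkpoint if one exists, otherwise start with an empty matrix
if (file.exists(checkpoint)) {
  R <- readRDS(checkpoint)
} else {
  R <- matrix(NA_real_, nrow = n, ncol = n)
}

for (i in seq_len(n)) {
  if (!anyNA(R[i, ])) next       # row already computed in an earlier run
  for (j in seq_len(n)) {
    R[i, j] <- gcm(i, j)
  }
  saveRDS(R, checkpoint)         # checkpoint after every completed row
}

restored <- readRDS(checkpoint)
```

Saving once per row rather than once per cell keeps the disk overhead small; after a power cut, at most one row of work is lost.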
About your code taking a lot of time I would advise trying the ?apply function, which
Returns a vector or array or list of values obtained by applying a function to margins of an array or matrix
You want gcm to be run for each cell, which means you want to apply it to each combination of row and column coordinates:
nR = 100; # number of rows
nC = 100; # number of columns
M = expand.grid(1:nR, 1:nC); # Cartesian product of the coordinates
# each row of M contains the indexes of one of R's cells
# head(M); # just to see it
# apply passes each row of M to the function as a single vector (see ?apply for the details),
# so I create a wrapper which takes one row of M and tells gcm that the first element is the row index and the second element is the column index
gcmWrapper = function(x) { return(gcm(x[1], x[2])); }
# run apply, which will return a vector containing *all* the evaluated expressions
R = apply(M, 1, gcmWrapper);
# re-shape R into a matrix (expand.grid varies the row index fastest, which matches R's column-wise fill)
R = matrix(R, nrow=nR, ncol=nC);
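Another way to fill the whole matrix in one call, offered only as a sketch, is outer() combined with Vectorize(). The gcm() below is a hypothetical stand-in that, like many user-written functions, only accepts one pair of scalars at a time; the comparison against the plain double loop is there to check that both give the same matrix.

```r
# Hypothetical scalar-only stand-in for gcm()
gcm <- function(i, j) if (i > j) i - j else i + j

nR <- 4
nC <- 3

# Reference result from the plain double loop
by_loop <- matrix(NA_real_, nrow = nR, ncol = nC)
for (i in seq_len(nR)) {
  for (j in seq_len(nC)) {
    by_loop[i, j] <- gcm(i, j)
  }
}

# outer() hands whole vectors of indices to its function,
# so the scalar-only gcm() is wrapped with Vectorize() first
by_outer <- outer(seq_len(nR), seq_len(nC), Vectorize(gcm))

print(by_outer)
```

This is mainly a readability gain: Vectorize() still calls gcm() once per cell, so it should not be expected to run much faster than the loop.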
If the apply approach is still slow, consider the snowfall package, which lets you follow the same apply-style approach using parallel computing. An introduction to snowfall usage can be found in this pdf; look at pages 5 and 6 in particular.

Running the same function multiple times and saving results with different names in workspace

So, I built a function called sort.song.
My goal with this function is to randomly sample the rows of a data.frame (DATA) and then filter the result (DATA.NEW) to analyse it. I want to do this multiple times (let's say 10 times). In the end, I want each object (mantel.something) produced by this function to be saved in my workspace with a name that I can relate to each cycle (mantel.something1, mantel.something2...mantel.something10).
I have the following code, so far:
sort.song<-function(DATA){
require(ade4)
for(i in 1:10){ # Am I using for correctly here?
DATA.NEW <- DATA[sample(1:nrow(DATA),replace=FALSE),]
DATA.NEW <- DATA.NEW[!duplicated(DATA.NEW$Point),]
coord.dist<-dist(DATA.NEW[,4:5],method="euclidean")
num.notes.dist<-dist(DATA.NEW$Num_Notes,method="euclidean")
songdur.dist<-dist(DATA.NEW$Song_Dur,method="euclidean")
hfreq.dist<-dist(DATA.NEW$High_Freq,method="euclidean")
lfreq.dist<-dist(DATA.NEW$Low_Freq,method="euclidean")
bwidth.dist<-dist(DATA.NEW$Bwidth_Song,method="euclidean")
hfreqlnote.dist<-dist(DATA.NEW$HighFreq_LastNote,method="euclidean")
mantel.numnotes[i]<<-mantel.rtest(coord.dist,num.notes.dist,nrepet=1000)
mantel.songdur[i]<<-mantel.rtest(coord.dist,songdur.dist,nrepet=1000)
mantel.hfreq[i]<<-mantel.rtest(coord.dist,hfreq.dist,nrepet=1000)
mantel.lfreq[i]<<-mantel.rtest(coord.dist,lfreq.dist,nrepet=1000)
mantel.bwidth[i]<<-mantel.rtest(coord.dist,bwidth.dist,nrepet=1000)
mantel.hfreqlnote[i]<<-mantel.rtest(coord.dist,hfreqlnote.dist,nrepet=1000)
}
}
Could someone please help me to do it the right way?
I think I'm not assigning the cycles correctly for each mantel.something object.
Many thanks in advance!
The best way to implement what you are trying to do is through a list. You can even make it take two indices, the first for the iterations, the second for the type of analysis.
mantellist <- as.list(1:10) ## initiate list with some values
for (i in 1:10){
...
mantellist[[i]] <- list(numnotes=mantel.rtest(coord.dist,num.notes.dist,nrepet=1000),
songdur=mantel.rtest(coord.dist,songdur.dist,nrepet=1000),
hfreq=mantel.rtest(coord.dist,hfreq.dist,nrepet=1000),
...)
}
return(mantellist)
In this way you can index your specific analysis for each iteration in an intuitive way:
mantellist[[2]][['hfreq']]
mantellist[[2]]$hfreq ## alternative
EDIT by Mohr:
Just for clarification...
So, according to your suggestion the code should be something like this:
sort.song<-function(DATA){
require(ade4)
mantellist <- as.list(1:10)
for(i in 1:10){
DATA.NEW <- DATA[sample(1:nrow(DATA),replace=FALSE),]
DATA.NEW <- DATA.NEW[!duplicated(DATA.NEW$Point),]
coord.dist<-dist(DATA.NEW[,4:5],method="euclidean")
num.notes.dist<-dist(DATA.NEW$Num_Notes,method="euclidean")
songdur.dist<-dist(DATA.NEW$Song_Dur,method="euclidean")
hfreq.dist<-dist(DATA.NEW$High_Freq,method="euclidean")
lfreq.dist<-dist(DATA.NEW$Low_Freq,method="euclidean")
bwidth.dist<-dist(DATA.NEW$Bwidth_Song,method="euclidean")
hfreqlnote.dist<-dist(DATA.NEW$HighFreq_LastNote,method="euclidean")
mantellist[[i]] <- list(numnotes=mantel.rtest(coord.dist,num.notes.dist,nrepet=1000),
songdur=mantel.rtest(coord.dist,songdur.dist,nrepet=1000),
hfreq=mantel.rtest(coord.dist,hfreq.dist,nrepet=1000),
lfreq=mantel.rtest(coord.dist,lfreq.dist,nrepet=1000),
bwidth=mantel.rtest(coord.dist,bwidth.dist,nrepet=1000),
hfreqlnote=mantel.rtest(coord.dist,hfreqlnote.dist,nrepet=1000)
)
}
return(mantellist)
}
You can achieve your objective of repeating this exercise 10 (or more times) without using an explicit for-loop. Rather than have the function run the loop, write the sort.song function to run one iteration of the process, then you can use replicate to repeat that process however many times you desire.
It is generally good practice not to create a bunch of named objects in your global environment. Instead, you can hold the results of each iteration of this process in a single object. replicate simplifies its result to an array when it can; passing simplify = FALSE makes it always return a list. In that case the list will have 10 elements (one for each iteration) and each element will itself be a list containing named elements corresponding to each result of mantel.rtest.
sort.song<-function(DATA){
DATA.NEW <- DATA[sample(1:nrow(DATA),replace=FALSE),]
DATA.NEW <- DATA.NEW[!duplicated(DATA.NEW$Point),]
coord.dist <- dist(DATA.NEW[,4:5],method="euclidean")
num.notes.dist <- dist(DATA.NEW$Num_Notes,method="euclidean")
songdur.dist <- dist(DATA.NEW$Song_Dur,method="euclidean")
hfreq.dist <- dist(DATA.NEW$High_Freq,method="euclidean")
lfreq.dist <- dist(DATA.NEW$Low_Freq,method="euclidean")
bwidth.dist <- dist(DATA.NEW$Bwidth_Song,method="euclidean")
hfreqlnote.dist <- dist(DATA.NEW$HighFreq_LastNote,method="euclidean")
return(list(
numnotes = mantel.rtest(coord.dist,num.notes.dist,nrepet=1000),
songdur = mantel.rtest(coord.dist,songdur.dist,nrepet=1000),
hfreq = mantel.rtest(coord.dist,hfreq.dist,nrepet=1000),
lfreq = mantel.rtest(coord.dist,lfreq.dist,nrepet=1000),
bwidth = mantel.rtest(coord.dist,bwidth.dist,nrepet=1000),
hfreqlnote = mantel.rtest(coord.dist,hfreqlnote.dist,nrepet=1000)
))
}
require(ade4)
results <- replicate(10, sort.song(DATA), simplify = FALSE) # simplify = FALSE keeps a plain list of 10 result lists
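The shape of what replicate() returns can be checked without ade4. In the sketch below, one_run() and the data frame d are hypothetical stand-ins for sort.song() and DATA; the function shuffles the rows and returns a named list, as sort.song() does.

```r
set.seed(1)

# Hypothetical stand-in for sort.song(): shuffle rows, return a named list
one_run <- function(d) {
  s <- d[sample(nrow(d)), , drop = FALSE]
  list(mean_x = mean(s$x), first_id = s$id[1])
}

d <- data.frame(id = 1:5, x = c(2, 4, 6, 8, 10))

# simplify = FALSE: a plain list with one element per repetition
as_list <- replicate(10, one_run(d), simplify = FALSE)

# default simplification: the equal-length lists are folded into a list-matrix
as_array <- replicate(10, one_run(d))

as_list[[2]]$mean_x      # result "mean_x" from the second repetition
as_array["mean_x", 2]    # same kind of lookup in the simplified form (a length-1 list)
```

With simplify = FALSE the indexing matches the mantellist[[2]]$hfreq pattern shown earlier in the thread, which is usually the more convenient structure when each result is itself a complex object.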

Write out results of for-loop of distance measures in matrix form in R

Suppose I have something like the following vector:
text <- as.character(c("string1", "str2ing", "3string", "stringFOUR", "5tring", "string6", "s7ring", "string8", "string9", "string10"))
I want to execute a loop that does pair-wise comparisons of the edit distance of all possible combinations of these strings (ex: string 1 to string 2, string 1 to string 3, and so forth). The output should be in a matrix form with rows equal to number of strings and columns equal to number of strings.
I have the following code below:
#Matrix of pair-wise combinations
m <- expand.grid(text,text)
#Define number of strings
n <- c(1:10)
#Begin loop; "method='osa'" in stringdist is default
for (i in 1:10) {
n[i] <- stringdist(m[i,1], m[i,2], method="osa")
write.csv(data.frame(distance=n[i]),file="/File/Path/output.csv",append=TRUE)
print(n[i])
flush.console()
}
The stringdist() function is from the stringdist package; a similar function, adist(), is bundled in the base utils package (it computes a generalized Levenshtein distance rather than "osa").
My question is, why is my loop not writing the results as a matrix, and how do I stop the loop from overwriting each individual distance calculation (ie: save all results in matrix form)?
I would suggest using stringdistmatrix instead of stringdist
(especially if you are using expand.grid)
res <- stringdistmatrix(text, text)
dimnames(res) <- list(text, text)
write.csv(res, "file.csv")
As for your concrete question: "My question is, why is my loop not writing the results as a matrix"
It is not clear why you would expect the output to be a matrix? You are calculating an element at a time, saving it to a vector and then writing that vector to disk.
Also, you should be aware that most of the arguments of write.csv cannot be changed: attempts to set append, col.names, sep, dec or qmethod are ignored with a warning. Use write.table instead.
If you want to do this iteratively, I would do the following:
# Column names, outputted only one time
write.table(rbind(names(data.frame(i=1, distance=n[1])))
,file="~/Desktop/output.csv",append=FALSE # <~~ Don't append for first run.
, sep=",", col.names=FALSE, row.names=FALSE)
for (i in 1:10) {
n[[i]] <- stringdist(m[i,1], m[i,2], method="osa")
write.table(data.frame(i=i, distance=n[i]),file="~/Desktop/output.csv"
,append=TRUE, sep=",", col.names=FALSE, row.names=FALSE)
print(n[[i]])
flush.console()
}
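For readers without the stringdist package, the full pairwise matrix can also be sketched with adist() from utils. Note the swap: adist() computes a generalized Levenshtein distance, not the "osa" method used in the question, so individual values may differ where transpositions are involved. The output path is a temporary file chosen for illustration.

```r
text <- c("string1", "str2ing", "3string", "stringFOUR", "5tring",
          "string6", "s7ring", "string8", "string9", "string10")

# adist() with a single argument compares every string with every other one
res <- adist(text)
dimnames(res) <- list(text, text)

# Write the whole matrix once, after it is complete
out_file <- tempfile(fileext = ".csv")   # illustrative path
write.csv(res, out_file)

# Read it back to confirm the file holds the full matrix
back <- as.matrix(read.csv(out_file, row.names = 1, check.names = FALSE))
print(res[1:3, 1:3])
```

Computing the matrix in one call and writing it once avoids both the overwriting problem and the need to append inside a loop.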

Assigning results of a for loop to an empty matrix

I have another question for the brilliant minds out there (this site is so addictive).
I am running some simulations on a matrix and I have nested for loops for this purpose. The first creates a vector that increases by one each time a loop cycles. The nested loop is running simulations by randomizing the vector, attaching it to the matrix, and calculating some simple properties on the new matrix. (For an example, I used properties that will not vary in the simulations, but in practice I require the simulations to get a good idea of the impact of the randomized vector.) The nested loop runs 100 simulations, and ultimately I want only the column means of those simulations.
Here's some example code:
property<-function(mat){ #where mat is a matrix
a=sum(mat)
b=sum(colMeans(mat))
c=mean(mat)
d=sum(rowMeans(mat))
e=nrow(mat)*ncol(mat)
answer=list(a,b,c,d,e)
return(answer)
}
x=matrix(c(1,0,1,0, 0,1,1,0, 0,0,0,1, 1,0,0,0, 1,0,0,1), byrow=T, nrow=5, ncol=4)
obj=matrix(nrow=100,ncol=5,byrow=T) #create an empty matrix to dump results into
for(i in 1:ncol(x)){ #nested for loops
a=rep(1,times=i) #repeat 1 for 1:# columns in x
b=rep(0,times=(ncol(x)-length(a))) #have the rest of the vector be 0
I.vec=append(a,b) #append these two for the I vector
for (j in 1:100){
I.vec2=sample(I.vec,replace=FALSE) #randomize I vector
temp=rbind(x,I.vec2)
prop<-property(temp)
obj[[j]]<-prop
}
write.table(colMeans(obj), 'myfile.csv', quote = FALSE, sep = ',', row.names = FALSE)
}
The problem I am encountering is how to fill in the empty object matrix with the results of the nested loop. obj ends up as a vector of mostly NAs, so it is clear that I am not assigning the results properly. I want each cycle to add a row of prop to obj, but if I try
obj[j,]<-prop
R tells me that there is an incorrect number of subscripts on the matrix.
Thank you so much for your help!
EDITS:
Okay, so here is the improved code re the answers below:
property<-function(mat){ #where mat is a matrix
a=sum(mat)
b=sum(colMeans(mat))
f=mean(mat)
d=sum(rowMeans(mat))
e=nrow(mat)*ncol(mat)
answer=c(a,b,f,d,e)
return(answer)
}
x=matrix(c(1,0,1,0, 0,1,1,0, 0,0,0,1, 1,0,0,0, 1,0,0,1), byrow=T, nrow=5, ncol=4)
obj<-data.frame(a=0,b=0,f=0,d=0,e=0) #create an empty dataframe to dump results into
obj2<-data.frame(a=0,b=0,f=0,d=0,e=0)
for(i in 1:ncol(x)){ #nested for loops
a=rep(1,times=i) #repeat 1 for 1:# columns in x
b=rep(0,times=(ncol(x)-length(a))) #have the rest of the vector be 0
I.vec=append(a,b) #append these two for the I vector
for (j in 1:100){
I.vec2=sample(I.vec,replace=FALSE) #randomize I vector
temp=rbind(x,I.vec2)
obj[j,]<-property(temp)
}
obj2[i,]<-colMeans(obj)
write.table(obj2, 'myfile.csv', quote = FALSE,
sep = ',', row.names = FALSE, col.names=F, append=T)
}
However, this is still glitchy, as the myfile should only have four rows (one for each column of x), but actually has 10 rows, with some repeated. Any ideas?
Your property function is returning a list. If you want to store the numbers in a matrix, you should have it return a numeric vector:
property <- function(mat)
{
....
c(a, b, c, d, e) # probably a good idea to rename your "c" variable
}
Alternatively, instead of defining obj to be a matrix, make it a data.frame (which conceptually makes more sense, as each column represents a different quantity).
obj <- data.frame(a=0, b=0, c=0, ...)
for(i in 1:ncol(x))
....
obj[j, ] <- property(temp)
Finally, note that your call to write.table will overwrite the contents of myfile.csv, so the only output it will contain is the result for the last iteration of i.
Use rbind:
obj <- rbind(obj, prop)
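Putting the suggestions together, a sketch of the whole simulation might look as follows. It also addresses the 10-row file from the edited question: there, obj2 grows by one row on each pass and the entire data frame is appended every time, so 1 + 2 + 3 + 4 = 10 rows are written. Here the summary is written once, after the outer loop. The names summary_mat, n_sims and out_file, the vector element names, and the temporary output path are assumptions made for illustration.

```r
set.seed(1)

# property() returns a named numeric vector instead of a list
property <- function(mat) {
  c(total = sum(mat),
    sum_col_means = sum(colMeans(mat)),
    grand_mean = mean(mat),
    sum_row_means = sum(rowMeans(mat)),
    n_cells = nrow(mat) * ncol(mat))
}

x <- matrix(c(1,0,1,0, 0,1,1,0, 0,0,0,1, 1,0,0,0, 1,0,0,1),
            byrow = TRUE, nrow = 5, ncol = 4)

n_sims <- 100
# One summary row per value of i, one column per property
summary_mat <- matrix(NA_real_, nrow = ncol(x), ncol = 5,
                      dimnames = list(NULL, names(property(x))))

for (i in seq_len(ncol(x))) {
  i_vec <- rep(c(1, 0), times = c(i, ncol(x) - i))
  obj <- matrix(NA_real_, nrow = n_sims, ncol = 5)  # fresh results for this i
  for (j in seq_len(n_sims)) {
    obj[j, ] <- property(rbind(x, sample(i_vec)))   # fill one row per simulation
  }
  summary_mat[i, ] <- colMeans(obj)
}

# Write once, after the loops: a header plus one row per column of x
out_file <- tempfile(fileext = ".csv")
write.table(summary_mat, out_file, quote = FALSE, sep = ",", row.names = FALSE)
```

Pre-allocating obj inside the outer loop also guarantees that means for one value of i never include leftover rows from a previous pass.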
