R save data.frame each iteration - r

I build a data frame in a loop, and I would like to write each small piece to a CSV file at every iteration. Something like rbind(), but into a file.
I have seen sink() used like this.
Example:
sink("resultat/output.txt")
df.total<-NULL
for(i in 1:length(list2.1)){
if(i%%100==0){print(i)}
tps<-as.data.frame(list2.1[i])
tps<-cbind(tps,as.data.frame(list2.2[i]))
colnames(tps)<-c("slope","echange")
tps$Run<-rep(data$run_nb[i],length(tps$slope))
tps$coefSlope<-rep(data$coef_Slope_i[i],length(tps$slope))
tps$coefDist<-rep(data$coef_dist_i[i],length(tps$slope))
sink(tps)
df.total<-rbind(df.total,tps)
}
sink()
write.csv(df.total,"resultat/df_total.csv")
but I don't think it works for my case.
Any suggestions?

You can use sink (which redirects R output to a connection) along with print
df <- data.frame(a = 0:1, b = 1:2)
sink("output.txt")
for(i in 1:10) {
print(df[i+2, ] <- c(sum(df$a), tail(df$b, 1) + 1))
## Or, to save the whole data.frame each time: print(df)
}
sink()
Another option is to use cat
df <- data.frame(a = 0:1, b = 1:2)
for(i in 1:10) {
cat(df[i+2, ] <- c(sum(df$a), tail(df$b, 1) + 1), "\n", file="output.txt", append=TRUE)
}
You can also use write.table to save the whole data.frame
df <- data.frame(a = 0:1, b = 1:2)
for(i in 1:10) {
df[i+2, ] <- c(sum(df$a), tail(df$b, 1) + 1)
write.table(df, file="output.txt", append=TRUE)
}
If you set append = FALSE only the last iteration will be saved.
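For the case in the question, where each iteration produces one small piece, the following is a minimal sketch rather than a definitive solution. It assumes a hypothetical list `pieces` of small data frames and a temporary output file, appends only the new piece at each step, and writes the header once. write.table is used because write.csv ignores the append argument.

```r
# Hypothetical pieces standing in for the per-iteration data frames.
pieces <- list(
  data.frame(slope = c(0.1, 0.2), echange = c(1, 2)),
  data.frame(slope = 0.3, echange = 3)
)
out_file <- tempfile(fileext = ".csv")  # assumed output location

for (i in seq_along(pieces)) {
  write.table(pieces[[i]], file = out_file, sep = ",",
              row.names = FALSE,
              col.names = (i == 1),  # header only on the first write
              append = (i > 1))      # the first write creates the file
}

# Read the file back to check that all pieces were stacked.
result <- read.csv(out_file)
```

Because nothing is accumulated in memory, this avoids the repeated rbind() on a growing data frame.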

Related

How to loop over row values in a two column data frame in R?

I have a data frame that looks like:
mydata <- data.frame(name = c("Rick", "Dan", "Michelle", "Ryan", "Gary"),
id = c (1:5))
I want to loop over the rows and pull out the name and id variables for each one. They are used to export a GTiff file. The important part is looping over each value. I've shown below how I could do it one by one, using the paste functions to build the names as strings where necessary. In this case I would have 5 GeoTIFF files, one for each name.
head(mydata)
x <- paste(mydata[1, 1])
x
y <- paste0(x, ".asc")
y
z <- paste(mydata[1, 2])
z
species_raster <- raster(y)
m <- c(0, as.numeric(z), 0, as.numeric(z), 1, 1)
rclmat <- matrix(m, ncol = 3, byrow = TRUE)
rc <- reclassify(species_raster, rclmat)
plot(rc)
writeRaster(rc,
filename = x,
format = "GTiff",
overwrite = TRUE)
You can get a list of your pasted filenames with this:
outputnames <- lapply(mydata[,1], paste0, ".asc")
#OR
outputnames <- lapply(mydata$name, paste0, ".asc")
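The names can then be used in another apply function or referenced in a loop. Loop over the name column rather than over mydata itself, because iterating over a data frame visits its columns, not its rows:
for(i in mydata$name){
# rc must be recomputed for each name inside the loop
writeRaster(rc,
filename = i,
format = "GTiff",
overwrite = TRUE)
}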
Another way to reference your data frame in a loop is the following:
for(i in 1:nrow(mydata)){
# paste0 avoids the space that paste would insert before ".asc"
filename <- paste0(mydata$name[i], ".asc")
print(filename)
Idascharacter <- as.character(mydata$id[i])
print(Idascharacter)
}
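The row-wise loop above can be sketched end to end without the raster package. Here `process_row` is a hypothetical helper standing in for the read, reclassify, and write steps from the question. It only assembles the values each iteration would need.

```r
mydata <- data.frame(name = c("Rick", "Dan", "Michelle", "Ryan", "Gary"),
                     id = 1:5, stringsAsFactors = FALSE)

# Hypothetical helper: replace the body with the raster calls as needed.
process_row <- function(name, id) {
  list(input = paste0(name, ".asc"),  # file to read
       output = name,                 # base name for the GeoTIFF
       id = id)                       # value used in the reclassify matrix
}

# Preallocate one slot per row, then fill it inside the loop.
results <- vector("list", nrow(mydata))
for (i in seq_len(nrow(mydata))) {
  results[[i]] <- process_row(mydata$name[i], mydata$id[i])
}
```

seq_len(nrow(mydata)) is used instead of 1:nrow(mydata) so that the loop does nothing when the data frame is empty.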

R loop to create data frames with 2 counters

What I want is to create 60 data frames with 500 rows in each. I tried the below code and, while I get no errors, I am not getting the data frames. However, when I do a View on the as.data.frame, I get the view, but no data frame in my environment. I've been trying for three days with various versions of this code:
getDS <- function(x){
for(i in 1:3){
for(j in 1:30000){
ID_i <- data.table(x$ID[j: (j+500)])
}
}
as.data.frame(ID_i)
}
getDS(DATASETNAME)
We can use outer (on a small example)
out1 <- c(outer(1:3, 1:3, Vectorize(function(i, j) list(x$ID[j:(j + 5)]))))
lapply(out1, as.data.table)
--
The issue in the OP's function is that ID_i is overwritten on every pass through the loop, so only the last value survives. To keep every result, we can initialize a list and store each piece in it:
getDS <- function(x) {
ID_i <- vector('list', 3)
for(i in 1:3) {
for(j in 1:3) {
ID_i[[i]][[j]] <- data.table(x$ID[j:(j + 5)])
}
}
ID_i
}
do.call(c, getDS(x))
data
x <- data.table(ID = 1:50)
I'm not sure the description matches the code, so I'm a little unsure what the desired result is. That said, it is usually not helpful to split a data.table because the built-in by-processing makes it unnecessary. If for some reason you do want to split into a list of data.tables you might consider something along the lines of
getDS <- function(x, n=5, size = nrow(x)/n, column = "ID", reps = 3) {
x <- x[1:(n*size), ..column]
index <- rep(1:n, each = size)
replicate(reps, split(x, index),
simplify = FALSE)
}
getDS(data.table(ID = 1:20), n = 5)
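If the goal is simply 60 consecutive, non-overlapping blocks of 500 rows (an assumption, since the question's code suggests overlapping windows instead), a base R sketch with split could look like this. The example data and the names `chunk_size` and `chunk_id` are illustrative only.

```r
# Example data: 30000 rows, so 60 blocks of 500 rows each.
x <- data.frame(ID = 1:30000)
chunk_size <- 500

# Rows 1-500 get id 1, rows 501-1000 get id 2, and so on.
chunk_id <- ceiling(seq_len(nrow(x)) / chunk_size)

# split() returns a named list of data frames, one per block.
chunks <- split(x, chunk_id)
```

Keeping the pieces in a list, rather than as 60 separate objects in the environment, makes them easy to process with lapply.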

R, define a function then apply to a list

I am trying to write a function (I am new to R, and most of what I know about R I learned from this website, thanks).
I want to apply my function to a list. The list contains some ".CSV" files.
All CSV files in my folder look like the picture below: same structure, but with different numbers of columns.
I want to:
based on the "Frame" column, delete all the rows that contain "T",
which leaves "110*n1" rows of data.
delete all the columns whose names contain "Flag"; they are blank columns.
delete the 1st column, which leaves "2*n2" columns.
reshape the multi-column data to 2 columns, which gives "110*n3" rows of data.
repeat "1,2,3,4,...,110" as series numbers n times (n=n3), and bind it as a column.
from "1,2,3,...,n3", repeat each value 110 times, and add it as a column.
export the new table as a txt file.
Here is what I've done so far:
T_function <- function(x) {
data.df <- read.csv(x, skip = 1,header=TRUE, na.strings=c("NA","NaN", " ","*"),
dec=".", strip.white=TRUE)
filename <- substr(x = x, start = 1, stop = (nchar(x)-4))
data.df[!grepl("T", data.df$Frame),]
data.df <- data.df [,-1]
data.df <- data.df [,colSums(is.na(data.df))<nrow(data.df)]
splitter <- function(indf, ncols) {
if (ncol(indf) %% ncols != 0) stop("Not the right number of columns to split")
inds <- split(sequence(ncol(indf)), c(0, sequence(ncol(indf)-1) %/% ncols))
temp <- unlist(lapply(inds, function(x) c(t(indf[x]))), use.names = FALSE)
as.data.frame(matrix(temp, ncol = ncols, byrow = TRUE))
}
out <- splitter(data.df, 2)
list <- 1:110
from <- which(out$V1 == 1)
to <- c((from-1)[-1], nrow(out))
end <- c(to/110)
list2 <- rep(list,length(to/110))
out$Number <- unlist(list2)
out$Number <- as.factor(out$Number)
list3 <- rep(1:end,each=110)
out$slice <- unlist(list3)
out$slice <- as.factor(out$slice)
write.table(x = data.df,
file = paste0(filename, "_analysis.txt"),
sep = ",",quote=F)
}
It seems the function cannot add the correct "out$Number" and "out$slice" columns.
filenames <- list.files(path = "",pattern="csv",full.names = T)
sapply(filenames, FUN = T_function)
I am trying to apply my function to all the files in the list, but it seems that apart from the 1st file I can't get the other files to work.
Could anybody help me find and solve the problems?
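One possible reading of the two numbering steps is sketched below. It is not taken from an answer to this question. It assumes the reshaped table `out` has exactly 110 rows per slice, and the stand-in data and the names `rows_per_slice` and `n_slices` are hypothetical. Note also that in the posted function the result of the "T" filter is never assigned back to data.df, and write.table exports data.df rather than out.

```r
# Hypothetical stand-in for `out` after reshaping: 110 rows per slice.
rows_per_slice <- 110
n_slices <- 3
out <- data.frame(V1 = rnorm(rows_per_slice * n_slices),
                  V2 = rnorm(rows_per_slice * n_slices))

# Number of slices actually present in the table.
n3 <- nrow(out) / rows_per_slice

# 1..110 repeated once per slice.
out$Number <- factor(rep(seq_len(rows_per_slice), times = n3))

# Slice index 1..n3, each value repeated 110 times.
out$slice <- factor(rep(seq_len(n3), each = rows_per_slice))
```

Deriving n3 from nrow(out) keeps the two columns the right length for every file, whatever its column count.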

Appending to a data frame with for, if and else statements, or how do I put printed output in a dataframe

How do I put what I printed in a dataframe with a for loop and if else statements?
Basically, this code:
list<-c(10,20,5)
for (j in 1:3){
if (list[j] < 8)
print("Greater")
else print("Less")
}
#[1] "Less"
#[1] "Less"
#[1] "Greater"
Or should it be something more like this?
f3 <- function(n){
df <- data.frame(x = numeric(n), y = character(n), stringsAsFactors = FALSE)
for(j in seq_len(n)){
if (list[j] < 8) {
df$x[j] <- j
df$y[j] <- "Greater"
} else {
df$y[j] <- "Less"
}
}
df
}
It's generally not a good idea to add rows one at a time to a data.frame. It's better to generate all the column data at once and then put it into a data.frame. For your specific example, the ifelse() function can help:
list<-c(10,20,5)
data.frame(x=list, y=ifelse(list<8, "Greater","Less"))
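When the per-row logic is too involved for ifelse(), one common pattern is to collect each row in a list and bind everything once at the end. The sketch below reuses the example values and keeps the question's labels as written. The names `values` and `rows` are illustrative.

```r
values <- c(10, 20, 5)

# One list slot per row, filled inside the loop.
rows <- vector("list", length(values))
for (j in seq_along(values)) {
  label <- if (values[j] < 8) "Greater" else "Less"
  rows[[j]] <- data.frame(x = values[j], y = label,
                          stringsAsFactors = FALSE)
}

# A single rbind at the end instead of one per iteration.
df <- do.call(rbind, rows)
```

This keeps the loop but avoids copying the growing data frame on every iteration.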

add row to a frame in a for loop

I have some code that creates a dataframe with 2 columns. I want to write data from a for loop to this dataframe. How do I do that?
df<-data.frame(id = numeric(), nobs = numeric())
setwd(directory)
files <-list.files(directory)
files <-files[id]
for (i in files) {
#print(i)
file <- read.csv(i)
x <- nrow(file)
num = as.numeric(gsub(".csv","",i))
y <- sprintf("%i %i", num, x)
#print(y)
df <- rbind(df,num,x)
}
To add rows to a data.frame in a loop, assign to the next row index instead of calling rbind:
df<-data.frame(id = numeric(), nobs = numeric())
for (i in 1:1000) {
df[i,] <- c(runif(1),runif(1))
}
However, if you know the number of rows needed then preallocating memory is strongly recommended:
files <- 1:1000
df<-data.frame(id = numeric(length(files)), nobs = numeric(length(files)))
for (i in 1:length(files)) {
df[i,] <- c(runif(1),runif(1))
}
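Applied to the file-counting case in the question, the preallocated version might look like the sketch below. The temporary directory and the three small CSV files are created here purely for illustration, and the file names are assumed to be numeric ids such as 1.csv.

```r
# Create three small hypothetical CSV files named 1.csv, 2.csv, 3.csv.
directory <- file.path(tempdir(), "nobs_demo")
dir.create(directory, showWarnings = FALSE)
for (k in 1:3) {
  write.csv(data.frame(v = seq_len(k * 2)),
            file.path(directory, paste0(k, ".csv")), row.names = FALSE)
}

files <- list.files(directory, pattern = "\\.csv$", full.names = TRUE)

# Preallocate one row per file, then fill the rows in place.
df <- data.frame(id = numeric(length(files)), nobs = numeric(length(files)))
for (i in seq_along(files)) {
  id <- as.numeric(sub("\\.csv$", "", basename(files[i])))
  df[i, ] <- c(id, nrow(read.csv(files[i])))
}
```

Using full.names = TRUE avoids the need for setwd(), and the anchored pattern "\\.csv$" strips only the file extension.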
