print to file in a loop 'R' - r

I am a beginner to 'R'. I have a loop where I shift data frames, merge them and then run a regression:
testsequence = seq(60,120000, by=60)
for(n in 1:length(testsequence)){
dfshift<-tail(df, (nrow(df)-testsequence[n]))
df1shift<-head(df1, (nrow(df1)-testsequence[n]))
dftogether<-cbind(dfshift,df1shift)
lm1<-lm(LPGT~Temp, data=dftogether)
write.table(lm1, file = "OUTPUT_Sensitivity_Results.csv")
}
The last line triggers this error message:
Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) :
cannot coerce class "lm" to a data.frame
Any ideas? I would also like to structure it so that the output file is not overwritten on each iteration of the loop. I saw a thread that suggested the following:
means <- sapply(filename, function(x) mean(as.numeric(read.table(x,header=FALSE)$V4)))
And then write the file as a whole with:
write.csv(data.frame(fname=filename,mean=means),file="output.csv")
but I'm not sure how to apply it to my case.
Any help would be appreciated!
Sonja

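The error occurs because write.table expects a data frame or matrix, and an lm object is neither. If you want the lines that appear at the console as a result of the implicit print in the REPL that runs at the top level of R, then use this instead:
write(capture.output(print(lm1)),
      file = "OUTPUT_Sensitivity_Results.txt",
      append = TRUE)
With append = TRUE, each iteration adds to the file instead of overwriting it. Note that I changed the file type so you would not think that it was a CSV file.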
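If a real CSV is wanted, one option is to collect the numbers from each fit and write the file once after the loop. The following is only a sketch under assumptions: df and df1 are simulated stand-ins for the data in the question, the test sequence is shortened, and the column names of results and the temporary output location are hypothetical choices.

```r
# Hypothetical stand-in data; the real df and df1 come from the question.
set.seed(1)
df  <- data.frame(LPGT = rnorm(600))
df1 <- data.frame(Temp = rnorm(600))

# Shortened sequence so the sketch runs quickly.
testsequence <- seq(60, 300, by = 60)

# Fit one model per shift and keep its coefficients as a one-row data frame.
rows <- lapply(testsequence, function(shift) {
  dfshift  <- tail(df,  nrow(df)  - shift)
  df1shift <- head(df1, nrow(df1) - shift)
  fit <- lm(LPGT ~ Temp, data = cbind(dfshift, df1shift))
  data.frame(shift     = shift,
             intercept = unname(coef(fit)[1]),
             slope     = unname(coef(fit)[2]),
             r_squared = summary(fit)$r.squared)
})

# Stack the rows and write the whole table in a single call.
results  <- do.call(rbind, rows)
out_file <- file.path(tempdir(), "OUTPUT_Sensitivity_Results.csv")
write.csv(results, file = out_file, row.names = FALSE)
```

Because the file is written only once, nothing is overwritten between iterations, and each row can be traced back to its shift.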

Related

"Error: `path` must be a string" when I try to run part of a loop in R using the readxl package

I am trying to run a loop that opens each of the files and adds it to a blank data frame.
fileNames <- list.files(pattern=".xlsm", recursive = TRUE)
fileNumbers <- seq(fileNames)
excel.data <- read_excel(fileNames[fileNumber],
skip=1,
col_names = T,
sheet = "Output")
It is part of this loop:
for (fileNumber in fileNumbers){
tryCatch({
# read in the data from excel.
excel.data <- read_excel(fileNames[fileNumber],
skip=1,
col_names = T,
sheet = "Output")........}).
The loop runs without errors but gives a blank column for the Excel data combined into the new data frame. When I try to run the individual snippet above on its own, the console says "Error: path must be a string".

Exporting a list of dataframes as csv

I have a list of data frames which I want to export as csv. I tried:
for (i in listofdf){
write.csv(listofdf[i], file = paste(names(listofdf)[i],".csv", sep=""), row.names = FALSE)
}
and got: Error in listofdf[i] : invalid subscript type 'list'. I read that I am feeding a list data type into a process that expects a vector, and tried to apply the given solution, unlist(listofdf), but all I got was a massive list of numeric values that I don't know how to deal with.
Then I tried a solution found here, which works with the given example:
sapply(names(listofdf),
function (x) write.table(listofdf[x],
file = paste(names(listofdf)[x],".csv", sep=""),
row.names = FALSE))
but when I try it on my data, it only exports one file named NA.csv. Do you know how to make any of those solutions work?
Your problem is how you're indexing your list object. for (i in listofdf) loops over the data frames themselves rather than over their positions, so names(listofdf)[i] isn't doing what you're thinking. Try this:
listofdf <- list(a = iris, b = iris, c = iris)
for (i in seq_along(listofdf)){
write.csv(listofdf[[i]], file = paste0(names(listofdf)[i], ".csv"), row.names = FALSE)
}
Side note: the default separator for paste is a space, so without sep = "" you would get a space before the ".csv" extension. paste0 pastes strings together with no separator.
Alternatively, as mentioned, you can use the writexl package:
library(writexl)
write_xlsx(listofdf, "output.xlsx")
This will create a file called "output.xlsx" with sheets that match names(listofdf) and the proper values stored within those sheets.
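For the sapply attempt in the question, the NA.csv file name most likely comes from names(listofdf)[x]: x is already a name, and names(listofdf) is an unnamed character vector, so looking it up by name returns NA. A minimal sketch of that variant, assuming a small example list and a temporary output directory (both chosen only for illustration):

```r
# Example list; the data frames are placeholders for the real ones.
listofdf <- list(a = iris, b = mtcars)

# Indexing the names vector by name finds nothing, hence "NA.csv".
names(listofdf)["a"]  # NA

# Use the name x directly for the file name and [[x]] to get the data frame.
out_dir <- tempdir()
paths <- vapply(names(listofdf), function(x) {
  path <- file.path(out_dir, paste0(x, ".csv"))
  write.csv(listofdf[[x]], file = path, row.names = FALSE)
  path
}, character(1))
```

Using [[x]] rather than [x] also passes the data frame itself to write.csv instead of a one-element list.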

Issues with user function/ map to read in and combine DBF files in R

I have written a function to read in a set of dbf files. Unfortunately, these files are very large, and I wouldn't want anyone to have to run them on my behalf.
readfun_dbf = function(path) {
test = read.dbf(path, as.is = TRUE) # don't convert to factors
test
}
dbfiles identifies the list of file names. map_dfr applies my function to the list of files and row binds them together. I've used very similar code to read in some text files, so I know the logic works.
dbfiles = list.files(pattern = "assign.dbf", full.names = F, recursive = T)
dbf_combined <- map_dfr(dbfiles, readfun_dbf)
When I run this, I get the error:
Error: Column `ASN_PCT` can't be converted from integer to character
So I ran the read.dbf command on all the files individually and noticed that some dbf files were being read in with all their fields as characters, and some were being read in with a mix of integers and characters. I figured that map_dfr needs the fields to be of the same type to bind them, so I added the mutate_all command to my function, but it's still throwing the same error.
readfun_dbf = function(path) {
test = read.dbf(path, as.is = TRUE) # don't convert to factors
mutate_all(test,as.character) # added line
test
}
Do you think the mixed field types are the issues? Or could it be something else? Any suggestions would be great!
mutate_all returns a modified copy and does not change test in place, so assign the value back to the object.
readfun_dbf = function(path) {
test = read.dbf(path, as.is = TRUE)
test <- dplyr::mutate_all(test,as.character)
return(test)
}
and then try:
dbf_combined <- purrr::map_dfr(dbfiles, readfun_dbf)
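The effect of the missing assignment can be seen without any dbf files. The sketch below uses base R only; to_character is a hypothetical stand-in for dplyr::mutate_all(test, as.character), and the small data frame imitates a file whose ASN_PCT field was read as integer.

```r
# Hypothetical helper standing in for mutate_all(test, as.character).
to_character <- function(d) {
  d[] <- lapply(d, as.character)
  d
}

# Result of the conversion is discarded, so the original is returned.
read_without_assign <- function(d) {
  to_character(d)
  d
}

# Result of the conversion is assigned back before returning.
read_with_assign <- function(d) {
  d <- to_character(d)
  d
}

example <- data.frame(ASN_PCT = 1:3)
class(read_without_assign(example)$ASN_PCT)  # still "integer"
class(read_with_assign(example)$ASN_PCT)     # "character"
```

R functions generally return new objects, so a transformation whose result is not assigned is silently lost.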

Error in file connection while creating a loop to read multiple csv files in the same directory

So, I'm creating a loop in R that reads through multiple csv files in a directory called "specdata" and afterwards tells you the mean of a particular column that those files have in common. The function is shown below. The arguments you specify are the directory in which the files are located, the column whose mean should be calculated, and an id sequence that selects which files to read by subsetting the file list with [].
Here is the function:
pollutantmean <- function(directory,pollutant,id) {
for (i in id) {archivo <- list.files(directory)[i]
file(archivo[i])
datapollution <- read.csv(archivo[i],header = TRUE)
datamatrix <- data.matrix(datapollution)
mean(datamatrix[pollutant],na.rm = TRUE)}}
the problem is that when the function is called:
pollutantmean("specdata",sulfurate,1:15)
it gives the following error message:
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
The interesting part is that the error does not occur when you call the part of the function that gives the error independently of the function, like this:
file(list.files("specdata")[2])
In this case it gives the desired connection, and later when you apply read.csv("specdata")[2] it also works perfectly.
So here is my question: what am I missing? It should be connecting and reading all the files the same way it does when the subsetting is on [2], but replacing the number 2 with the respective i, looping through the function and making me happy. Why does it give an error here but not when subsetting on 2 is executed?
I read somewhere that I have to use rbind, but either way that would come after generating the connection and reading the listed files. I need to solve this first error before that (not sure how I would do it afterwards...).
Yep, I'm from Coursera, sorry to be a cliché, but I'm a really nice guy. PLEASE HELP :)
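pollutantmean <- function(directory, pollutant, id) {
  # full.names = TRUE returns paths that include the directory,
  # so read.csv can find the files from the current working directory
  files <- list.files(directory, full.names = TRUE, pattern = "\\.csv$")
  values <- numeric(0)
  for (i in id) {
    datapollution <- read.csv(files[i], header = TRUE, stringsAsFactors = FALSE)
    # pollutant is the column name, passed as a string
    values <- c(values, datapollution[[pollutant]])
  }
  # mean over the values from all selected files, ignoring NAs
  mean(values, na.rm = TRUE)
}
pollutantmean("specdata", "sulfurate", 1:15)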
So it worked: just adding full.names = TRUE, eliminating the file function, and eliminating i from the subsetting of list.files did the trick.
function(directory,pollutant,id) {
for (i in id) {archivo <- list.files(directory,full.names = TRUE)
datapollution <- read.csv(archivo[i],header = TRUE)
datamatrix <- data.matrix(datapollution)
resultmean <- mean(datamatrix[pollutant],na.rm = TRUE)}
print(resultmean)}
I would like to understand, though:
What does the full.names = TRUE argument of the list.files function actually do?
Why is no file() function needed? Is the connection generated automatically with list.files()?
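A small experiment can illustrate both points. It is only a sketch: the demo directory, the file 001.csv and its contents are made up for the example. list.files only returns names and opens nothing. With full.names = TRUE it prepends the directory, giving a path that read.csv can resolve. read.csv then opens and closes its own connection when handed a path, so no explicit file() call is required.

```r
# Build a throwaway directory with one csv file (hypothetical data).
demo_dir <- file.path(tempdir(), "specdata_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(sulfurate = c(1, 2, NA)),
          file.path(demo_dir, "001.csv"), row.names = FALSE)

# Bare file names: only valid if the working directory is demo_dir.
short <- list.files(demo_dir)
# Directory plus file name: usable from any working directory.
full  <- list.files(demo_dir, full.names = TRUE)

short                # "001.csv"
basename(full)       # "001.csv", same file, but full also carries the directory
file.exists(full)    # TRUE

# read.csv takes the path and manages the connection itself.
d <- read.csv(full[1])
mean(d$sulfurate, na.rm = TRUE)  # 1.5
```

This matches the original error: the bare name returned by list.files("specdata") does not exist relative to the working directory, so the connection cannot be opened.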

Write col names while writing csv files in R

What is the proper way to set the column names in the header of a csv table that is generated by the write.table command?
For example, write.table(x, file, col.names= c("ABC","ERF")) throws an error saying invalid col.names specification. Is there a way to get around the error while still using write.table?
Edit:
I am in the middle of writing a large piece of code, so exact data replication is not possible. However, this is what I have done:
write.table(paste("A","B"), file="AB.csv", col.names=c("A1","B1")), and I am still getting this error: Error in write.table(paste("A","B"), file="AB.csv", col.names=c("A", : invalid 'col.names' specification.
Is this what you expect? It works on my end:
df <- data.frame(condition_1sec=1)
df1 <- data.frame(susp=0)
write.table(c(df,df1),file="table.csv",col.names = c("A","B"),sep = ",",row.names = FALSE)
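The error in the question is consistent with a length mismatch: write.table requires a character col.names to have one entry per column, and paste("A","B") is a single string, which becomes a one-column table. A sketch of both cases, assuming a made-up two-column data frame and temporary file paths:

```r
# Two columns, two replacement names: accepted.
x <- data.frame(first = 1:3, second = c("a", "b", "c"))
ok_file <- file.path(tempdir(), "AB.csv")
write.table(x, file = ok_file, sep = ",",
            row.names = FALSE, col.names = c("ABC", "ERF"))
header <- readLines(ok_file, n = 1)
header  # "\"ABC\",\"ERF\"", i.e. the quoted names ABC and ERF

# One column (a single string) but two names: rejected.
bad_file <- file.path(tempdir(), "AB_bad.csv")
err <- tryCatch(
  write.table(paste("A", "B"), file = bad_file, col.names = c("A1", "B1")),
  error = function(e) conditionMessage(e)
)
err  # "invalid 'col.names' specification"
```

Checking ncol() of the object before writing is a quick way to confirm how many names are needed.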
