I'm trying to read a .txt file and create an .xlsx file from it. Here is my code:
setwd("~/Downloads/Test")
FILES <- list.files( pattern = ".txt")
library(openxlsx)
for (i in 1:length(FILES)) {
FILE=read.delim(FILES[i],header=FALSE,stringsAsFactors=FALSE,sep = "\t", quote = "")
colnames(FILE) <- c("A","B","C","D","E","F")
write.csv(FILE,file="Dummy.xls")
}
When I run this code, I can create an Excel file with the .xls extension. But when I try the same with .xlsx, by changing the argument to file="Dummy.xlsx", the file is created, yet opening it gives the error below.
Please let me know where I am going wrong and how I can fix this.
Thanks
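A likely cause, offered as a sketch rather than a confirmed diagnosis since the error text is not shown: write.csv() always writes plain comma-separated text, whatever extension the file name carries. A real .xlsx file is a zip archive, so Excel rejects text that merely has an .xlsx name. The check below uses only base R; the write.xlsx() call from openxlsx is one way to produce a genuine workbook and runs only if that package is installed. The data frame and file names are made up for illustration.

```r
# Write a data frame with write.csv() but give it an .xlsx name.
df <- data.frame(A = 1:2, B = c("x", "y"))
csv_as_xlsx <- file.path(tempdir(), "Dummy.xlsx")
write.csv(df, file = csv_as_xlsx, row.names = FALSE)

# A real .xlsx file is a zip archive, so it starts with the bytes "PK".
first_bytes <- rawToChar(readBin(csv_as_xlsx, what = "raw", n = 2))
is_zip <- identical(first_bytes, "PK")
is_zip  # FALSE: the file is plain text despite its name

# Writing a genuine workbook instead (only if openxlsx is available).
if (requireNamespace("openxlsx", quietly = TRUE)) {
  real_xlsx <- file.path(tempdir(), "Dummy_real.xlsx")
  openxlsx::write.xlsx(df, file = real_xlsx)
}
```

In the loop from the question, that would mean replacing the write.csv() call with write.xlsx(FILE, file = ...) and giving each input its own output name so the files do not overwrite one another.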
Related
I'm attempting to convert a large number of .xlsx files to .csv, while also specifying a new folder or directory for them to be placed in. Specifically, I want to create a new folder in my working directory to house the newly-converted .csv files.
Based on previous examples, I have managed to complete the conversion portion using the following code
setwd("~/Myfolder")
files.to.read = list.files(pattern="xlsx")
lapply(files.to.read, function(f) {
df = read.xlsx(f, sheetIndex=1)
write.csv(df, gsub("xlsx", "csv", f), row.names=FALSE)})
This successfully converts all .xlsx files to .csv in my original working directory. However, what I want is to create a new subfolder within that directory and place those .csv files in it. I know the answer likely involves adding either dir.create() or file.path() to the write.csv() command. However, when I use either of them, I get the following error:
Error in file(file, ifelse(append, "a", "w")) : invalid 'open' argument
It's hard to know without a reproducible example. What happens if you try to do read.xlsx(files.to.read[1], sheetIndex=1)?
If that works, you are quite close.
dir.create("your_folder_name")
files.to.read = list.files(pattern="xlsx")
lapply(files.to.read, function(f) {
df = read.xlsx(f, sheetIndex=1)
# Make the new filename here
new_filename = file.path(getwd(), "your_folder_name", gsub("xlsx", "csv", f))
write.csv(df, new_filename , row.names=FALSE)
# provide some feedback
print(paste("Writing", new_filename))
}
)
It might be that your list.files() command is having trouble.
If the previous attempt fails, try:
# Mind the full.names=TRUE to get the full path
files.to.read = list.files(pattern="xlsx", full.names=TRUE)
Then get rid of the new_filename line. You won't need to build the path via file.path; just use the gsub command as you were doing.
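The path-building step can also be pulled out into a small helper, which makes it easy to check on its own. This is a minimal sketch in base R: the helper name csv_path_in, the folder csv_output, and the input name reports/data1.xlsx are all hypothetical. It uses basename() so the output lands in the subfolder even when the input carries a directory prefix.

```r
# Hypothetical helper: map an .xlsx file name to a .csv path inside out_dir,
# creating out_dir first if it does not exist yet.
csv_path_in <- function(xlsx_file, out_dir) {
  if (!dir.exists(out_dir)) dir.create(out_dir, recursive = TRUE)
  # basename() drops any leading directories; the anchored pattern
  # replaces only the trailing ".xlsx" extension.
  file.path(out_dir, sub("\\.xlsx$", ".csv", basename(xlsx_file)))
}

out_dir <- file.path(tempdir(), "csv_output")
p <- csv_path_in("reports/data1.xlsx", out_dir)

# write.csv() takes the path as its second argument (file), not via append.
write.csv(data.frame(x = 1:3), p, row.names = FALSE)
```

Inside the lapply() call above, the same helper would be used as write.csv(df, csv_path_in(f, "your_folder_name"), row.names=FALSE).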
I have several files in a folder. They all have the same layout and I have extracted the information I want from them.
So now, for each file, I want to write a .csv file and name it after the original input file and add "_output" to it.
However, I don't want to repeat this process manually for each file. I want to loop over them. I looked for help online and found lots of great tips, including many in here.
Here's what I tried:
#Set directory
dir = setwd("D:/FRhData/elb") #set directory
filelist = list.files(dir) #save file names into filelist
myfile = matrix()
#Read files into R
for ( i in 1:length(filelist)){
myfile[i] = readLines(filelist[i])
*code with all calculations*
write.csv(x = finalDF, file = paste (filename[i] ,"_output. csv")
}
Unfortunately, it didn't work out. Here's the error message I get:
Error in as.character(x) :
cannot coerce type 'closure' to vector of type 'character'
In addition: Warning message:
In myfile[i] <- readLines(filelist[i]) :
number of items to replace is not a multiple of replacement length
And 'report2016-03.txt' is the name of the first file the code should be executed on.
Does anyone know what I should do to correct this mistake - or any other possible mistakes you can foresee?
Thanks a lot.
======================================================================
Here are some of the resources I used:
https://www.r-bloggers.com/looping-through-files/
How to iterate over file names in a R script?
Looping through files in R
Loop in R loading files
How to loop through a folder of CSV files in R
This worked for me. I used a vector instead of a matrix, took out the readLines() call and used paste0 since there was no separator.
setwd("C:/R_projects") # set working directory
dir = getwd() # setwd() returns the previous directory, so take the current one from getwd()
filelist = list.files(dir) # save file names into filelist
myfile = vector()
finalDF <- data.frame(a=3, b=2)
# Write one output file per input file
for ( i in seq_along(filelist)){
myfile[i] = filelist[i]
write.csv(x = finalDF, file = paste0(myfile[i] ,"_output.csv"))
}
list.files(dir)
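A variation on the same loop, sketched under a few assumptions: the folder elb_demo and its single demo file are created only for illustration, and the one-column finalDF stands in for the real calculations. It shows two points. readLines() returns one element per line, so its result cannot be stored in a single vector slot, which is what triggers the "number of items to replace" warning. And file_path_sans_ext() from the tools package (shipped with R) strips the extension, so the output is named report2016-03_output.csv rather than report2016-03.txt_output.csv.

```r
library(tools)  # for file_path_sans_ext()

# Demo folder with one input file (illustrative only).
in_dir <- file.path(tempdir(), "elb_demo")
dir.create(in_dir, showWarnings = FALSE)
writeLines(c("a", "b"), file.path(in_dir, "report2016-03.txt"))

# Restrict to .txt inputs so earlier outputs are not picked up again.
filelist <- list.files(in_dir, pattern = "\\.txt$", full.names = TRUE)

for (f in filelist) {
  lines <- readLines(f)                           # character vector, one element per line
  finalDF <- data.frame(n_lines = length(lines))  # placeholder for the real calculations
  out <- paste0(file_path_sans_ext(f), "_output.csv")
  write.csv(finalDF, file = out, row.names = FALSE)
}
```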
I am trying to read a number of Excel files into R with read.xlsx from the xlsx package, but when I do so I get the following error:
Error in loadWorkbook(file) : Cannot find id100.xlsx
First I list the files in the directory:
> files <- list.files(datDir, pattern = ".xlsx")
Then I use read.xlsx to read them all in:
for (i in seq_along(files)) {
assign(paste("id", i, sep = "."), read.xlsx(files[i],1,as.data.frame=TRUE,
header=FALSE, stringsAsFactors=FALSE, na.strings=" "))
}
I checked to see if the file was even in the list and it is:
> files
[1] "id100.xlsx" "id101.xlsx" etc...
> files[1]
[1] "id100.xlsx"
I have used this code many times before today and for some reason it is just not working. I keep getting that error. Does anyone have any suggestions?
Thanks!
If your working directory is different from datDir, you should use full.names=TRUE like this:
files <- list.files(datDir, pattern = ".xlsx",full.names=TRUE)
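The difference can be seen without any Excel package. The sketch below creates an illustrative folder (datDir_demo, a made-up name) with two empty files and compares the two forms of list.files(). Without full.names the result is bare file names, which R resolves against the working directory; with full.names = TRUE each entry carries the directory, so it can be opened from anywhere.

```r
# Illustrative data directory with two empty placeholder files.
dat_dir <- file.path(tempdir(), "datDir_demo")
dir.create(dat_dir, showWarnings = FALSE)
file.create(file.path(dat_dir, c("id100.xlsx", "id101.xlsx")))

# Bare names: only valid if the working directory is dat_dir.
short_names <- list.files(dat_dir, pattern = "\\.xlsx$")

# Names prefixed with the directory: valid from any working directory.
full_paths <- list.files(dat_dir, pattern = "\\.xlsx$", full.names = TRUE)

file.exists(full_paths)
```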
I'm using the following code:
setwd("~/R/Test")
require(openxlsx)
file_list <- list.files(getwd())
for (file in file_list){
file = read.xlsx(file)
write.csv(file,file=file)
}
It opens each file in a directory, reads the Excel file, and saves it as a CSV. However, I'd like to take the original file name and save the CSV under that same name. Is there a way to do this?
Thanks!
As pointed out in the comments, you're overwriting the variable file. I also recommend changing the extension of the file. Try this as your for loop:
for (file in file_list) {
file.xl <- read.xlsx(file)
write.csv(file.xl, file = sub("xlsx$", "csv", file))
}
Note that you'll need to change "xlsx$" to "xls$" if the files in your directory have the .xls extension.
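The $ anchor in the pattern matters because sub() replaces only the first match it finds. A short base-R illustration with made-up file names:

```r
files <- c("sales.xlsx", "xlsx_notes.xlsx")

# Anchored: only the trailing extension is replaced.
anchored <- sub("xlsx$", "csv", files)
anchored    # "sales.csv" "xlsx_notes.csv"

# Unanchored: the first "xlsx" anywhere in the name is replaced.
unanchored <- sub("xlsx", "csv", files)
unanchored  # "sales.csv" "csv_notes.xlsx"
```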
I would like to be able to open files quickly in Excel after saving them. I learned how from the SO question "R opening a specific worksheet in a excel workbook using shell.exec".
On my Windows system, I can do so with the following code, and could perhaps turn it into a function: saveOpen <- function(...) {...}. However, I suspect there are better ways to accomplish this modest goal.
I would appreciate any suggestions to improve this multi-step effort.
# create tiny data frame
df <- data.frame(names = c("Alpha", "Baker"), cities = c("NYC", "Rome"))
# save the data frame to an Excel file in the working directory
save.xls(df, filename = "test file.xlsx")
# I have to reenter the file name and add a forward slash for the paste() command below to create a proper file path
name <- "/test file.xlsx"
# add the working directory path to the file name
file <- paste0(getwd(), name)
# with shell and .exec for Windows, open the Excel file
shell.exec(file = file)
Do you just want to create a helper function to make this easier? How about
save.xls.and.open <- function(dataframe, filename, ...) {
save.xls(dataframe, filename=filename, ...)
cmd <- file.path(getwd(), filename)
shell.exec(cmd)
}
then you just run
save.xls.and.open(df, filename ="testfile.xlsx")
I guess it doesn't seem like all that many steps to me.