How to write data into a macro-enabled Excel file (write.xlsx corrupts my document)? - r

I'm trying to write a table into a macro-enabled Excel file (.xlsm) from R. Neither write.xlsx (openxlsx) nor writeWorksheetToFile (XLConnect) works for me.
When I used the openxlsx package, as seen below, the resulting .xlsm files ended up getting corrupted.
Code:
library(XLConnect)
library(openxlsx)
for (i in 1:3){
write.xlsx(Input_Files[[i]], Inputs[i], sheetName="Input_Sheet")
}
#Input_Files[[i]] are the R data.frames which need to be inserted into the .xlsm files
#Inputs[i] are the paths of the Excel files into which the tables should be written
Corrupted .xlsm file error message after write.xlsx:
Excel cannot open the file 'xxxxx.xlsm' because the file format or file extension is not valid. Verify that the file has not been corrupted and that the file extension matches the format of the file.
After researching this problem extensively, I found that the XLConnect package offers the writeWorksheetToFile function, which works with .xlsm. However, after a few runs it fails with an error message saying there is no more free space, and it takes 20+ minutes for tables with approximately 10,000 rows. I tried calling xlcFreeMemory at the beginning of each loop iteration, but that doesn't solve the issue.
Code:
library(XLConnect)
library(openxlsx)
for (i in 1:3){
xlcFreeMemory()
writeWorksheetToFile(Inputs[i], Input_Files[[i]], "Input_Sheet")
}
#Input_Files[[i]] are the R data.frames which need to be inserted into the .xlsm files
#Inputs[i] are the paths of the Excel files into which the tables should be written
Could anyone recommend a way to easily and quickly transfer an R table into an xlsm file without corrupting it?
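A likely cause of the corruption is that write.xlsx builds a brand-new workbook in .xlsx format and saves it under the .xlsm name, so the content no longer matches the extension. Below is a minimal sketch of an alternative that edits the existing workbook in place. It assumes that openxlsx keeps the VBA project when it loads and re-saves a .xlsm file, which is worth verifying on a copy first. write_to_xlsm is a hypothetical helper name, not part of any package.

```r
# Hypothetical helper: write a data.frame into one sheet of an existing
# macro-enabled workbook, leaving the rest of the workbook in place.
# Assumption: openxlsx preserves the macros on load/save; test on a copy.
write_to_xlsm <- function(df, path, sheet = "Input_Sheet") {
  wb <- openxlsx::loadWorkbook(path)            # load the existing .xlsm
  if (!(sheet %in% names(wb))) {
    openxlsx::addWorksheet(wb, sheet)           # create the sheet if it is missing
  }
  openxlsx::writeData(wb, sheet = sheet, x = df)        # writes starting at cell A1
  openxlsx::saveWorkbook(wb, path, overwrite = TRUE)    # save back to the same file
  invisible(path)
}

# Usage with the objects from the question (assumed to exist):
# for (i in seq_along(Inputs)) write_to_xlsm(Input_Files[[i]], Inputs[i])
```

Note that writeData only overwrites the cells it covers, so if the new table is smaller than the old one, leftover rows or columns from the previous contents remain on the sheet.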

Related

Optimum way to overwrite an xlsx worksheet

I'm trying to write an Excel worksheet with the XLConnect package. The data I'm using is a data.frame (820*132). Once I'm done building the dataset, I'm using the writeWorksheetToFile function to export.
If the file does not exist yet and I am creating it from scratch, everything works well.
If I want to overwrite an existing sheet, the function takes approximately a minute to write. In addition, when I open the Excel file, I get an error message saying: "We found a problem with some content in 'my_file.xlsx'. Do you want us to try to recover as much as we can?"
I tried other packages for writing to Excel, such as xlsx and openxlsx, but they do not allow overwriting a sheet without overwriting the entire workbook.
I've checked a few solutions such as this, but they are not optimal.
I am looking for the most optimal way of writing excel worksheets, with an overwrite option that is suitable for large datasets.
I'm using the latest versions of R and RStudio.
My Excel version is 1902, 64-bit.

Importing .xls file that is saved as *.htm, *.html as it is saved on the backend

I have a requirement to import an .xls file that is actually saved as *.htm or *.html.
How do I load this into a data frame in R? The data is in Sheet1, starting from row 5. I have been struggling to load it with the xlsx and readxl packages, but neither works because the native format of the file is different.
I can't open and re-save the file manually as .xlsx, because the process has to be automated.
Note that when I do save it manually as an .xlsx file, it loads fine. But that's not what I need.
Kindly help me with this.
Try the openxlsx package and its function read.xlsx. If that doesn't work, you could programmatically rename the file as described for example here, and then open it using one of these excel packages.
Your file could be in xls format instead of xlsx. Have you tried the read_xls() function from readxl? It could also be in a text format, in which case read.table() or fread() from data.table should work. The fact that it works after saving the file as xlsx strongly suggests that it is not formatted as an xlsx to begin with.
Hope this helps.
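One way to find out what the file really is, rather than trusting its extension, is to look at its first bytes. The sketch below uses only base R. sniff_format is a hypothetical helper name, and the check is a heuristic: a zip signature indicates an .xlsx-style container (but also any other zip file), and the HTML test simply looks for a few common tags near the start.

```r
# Hypothetical helper: guess a file's real format from its first bytes,
# ignoring the extension. Heuristic only.
sniff_format <- function(path) {
  bytes <- readBin(path, what = "raw", n = 8)
  # "PK\x03\x04": zip container, used by .xlsx/.xlsm (and other zip files)
  zip_sig <- as.raw(c(0x50, 0x4B, 0x03, 0x04))
  # OLE2 compound document header, used by legacy binary .xls
  ole2_sig <- as.raw(c(0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1))
  if (length(bytes) >= 4 && identical(bytes[1:4], zip_sig)) return("xlsx")
  if (length(bytes) >= 8 && identical(bytes[1:8], ole2_sig)) return("xls")
  # Otherwise treat it as text and look for HTML-like markup in the first lines
  head_txt <- paste(readLines(path, n = 5, warn = FALSE), collapse = " ")
  if (grepl("<(html|table|!doctype|\\?xml)", head_txt,
            ignore.case = TRUE, useBytes = TRUE)) {
    return("html")
  }
  "text"
}

# Usage (the file name is a placeholder):
# sniff_format("report.xls")
```

If this reports "html", an HTML table reader is the right tool; if it reports "text", read.table() or fread() are better candidates than any Excel package.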

Importing to R an Excel file saved as web-page

I would like to open an Excel file saved as webpage using R and I keep getting error messages.
The desired steps are:
1) Upload the file into RStudio
2) Change the format into a data frame / tibble
3) Save the file as an xls
The message I get when I open the file in Excel is that the file format (excel webpage format) and extension format (xls) differ. I have tried the steps in this answer, but to no avail. I would be grateful for any help!
I don't expect anybody will be able to give you a definitive answer without a link to the actual file. The complication is that many services will write files as .xls or .xlsx without them being valid Excel format. This is done because Excel is so common and some non-technical people feel more confident working with Excel files than a csv file. Now, the files will have been stored in a format that Excel can deal with (hence your warning message), but R's libraries are more strict and don't see the actual file type they were expecting, so they fail.
That said, the below steps worked for me when I last encountered this problem. A service was outputting .xls files which were actually just HTML tables saved with an .xls file extension.
1) Download the file to work with it locally. You can script this of course, e.g. with download.file(), but this step helps eliminate other errors involved in working directly with a webpage or connection.
2) Load the full file with readHTMLTable() from the XML package
library(XML)
dTemp = readHTMLTable([filename], stringsAsFactors = FALSE)
This will return a list of dataframes. Your result set will quite likely be the second element or later (see ?readHTMLTable for an example with explanation). You will probably need to experiment here and explore the list structure as it may have nested lists.
3) Extract the relevant list element, e.g.
df = dTemp[[2]]
You also mention writing out the final data frame as an xls file which suggests you want the old-style format. I would suggest the package WriteXLS for this purpose.
I seriously doubt the Excel file is 'saved as a web page'. I'm pretty sure the file just sits on a server and all you have to do is fetch it. Some kinds of files (in particular Excel and h5) are binary rather than text files. Downloading them needs an extra setting (mode="wb") to tell R that the file is binary and should be handled appropriately.
myurl <- "http://127.0.0.1/imaginary/file.xlsx"
download.file(url=myurl, destfile="localcopy.xlsx", mode="wb")
Or use the downloader package and try something like this:
library(downloader)
myurl <- "http://127.0.0.1/imaginary/file.xlsx"
download(myurl, destfile="localcopy.xlsx", mode="wb")

R Copying to and Reading from csv Files

When I go to save Excel data that I've pasted into a .csv file, I get a formatting issue and often the saved file has all the numbers in each row as one long string.
My read statement is
resids<-read.csv("C:\\Projects\\residuals_Parts3.csv",header=TRUE)
Any ideas on how to fix this?
The warning you are getting is fairly standard in Excel: any formatting you've added to the file (e.g. widening columns) will be lost if you don't save the file as an Excel file, and the warning is supposed to remind you of this. Personally, the extra click or two annoys me too.
If you would like to avoid converting Excel files to CSV before bringing them into R, try the openxlsx package. It's saved me from a lot of that monkey business.

Error: Invalid: File is too small to be a well-formed file - error when using feather in R

I'm trying to use feather (v. 0.0.1) in R to read a fairly large (3.5 GB) csv file with 21178665 rows and 16 columns.
I use the following lines to load the file:
library(feather)
path <- "pp-complete.csv"
df <- read_feather(path)
But I get the following error:
Error: Invalid: File is too small to be a well-formed file
There's no explanation in the documentation of read_feather so I'm not sure what's the problem. I guess this function expects a different file form but I'm not sure what that would be.
Btw, I can read the file with read_csv in readr library but it takes a while.
The feather file format is distinct from a CSV file format. They are not interchangeable. The read_feather function cannot read simple CSV files.
If you want to read CSV files quickly, your best bets are probably readr::read_csv or data.table::fread. For large files, it will still usually take a while just to read it from disc.
After you've loaded the data into R, you can create a file in the feather format with write_feather so you can read it with read_feather the next time.
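The read-once, cache-as-feather workflow described above could be sketched as follows. read_cached and the default cache file name are hypothetical choices for illustration. The sketch assumes the data.table and feather packages are installed.

```r
# Hypothetical helper: read a CSV once (slow), cache it as a feather file,
# and reuse the cache on later calls (fast).
read_cached <- function(csv_path,
                        feather_path = paste0(csv_path, ".feather")) {
  if (file.exists(feather_path)) {
    # Later runs: load the cached feather file
    return(as.data.frame(feather::read_feather(feather_path)))
  }
  # First run: parse the CSV, then write the cache for next time
  df <- data.table::fread(csv_path, data.table = FALSE)
  feather::write_feather(df, feather_path)
  df
}

# Usage with the file from the question:
# df <- read_cached("pp-complete.csv")
```

The cache is not invalidated automatically, so if the CSV changes, delete the feather file to force a fresh read.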
