How to load xlsx file using fread function? - r

I wanted to use the fread function to load all of my datasets, as I think it is better to stick to one type of import function, so I settled on fread.
A few of my files are in xlsx format, so I was saving them as csv and then loading them with fread.
But I noticed that when I converted the xlsx files to csv, an empty or incomplete row appeared in the newly created csv files.
Is there a way to resolve this issue? Can I somehow load an xlsx file with fread directly, rather than converting it to a csv file first and then loading that?

Here's how: use a command line tool together with fread. csvkit's in2csv converts the spreadsheet to csv on standard output, and fread can read from that command:
my.dt <- fread(cmd = "in2csv my.xlsx")
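A slightly fuller sketch of the same idea, assuming csvkit is installed (for example via pip install csvkit) and that my.xlsx and Sheet1 are placeholder names:
library(data.table)

# in2csv (from csvkit) writes the chosen sheet as csv to stdout;
# fread(cmd = ...) reads that output directly, so no intermediate csv file lands on disk.
my.dt <- fread(cmd = "in2csv --sheet Sheet1 my.xlsx")
If an empty row still shows up after importing this way, it is most likely present in the sheet itself and can simply be dropped after the import.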

Related

Exporting dataframe to an existing excel without using xlsx package R

I need to export multiple dataframes to an existing Excel file. I'm working in AWS and, as stated in the post Can't upload xlsx library in Amazon Web Service, I can't use write.xlsx from the xlsx package.
Similar functions such as write_xlsx do not have an append option that would let me add the dataframes to the existing file without overwriting it. Which function could substitute for write.xlsx? Thanks in advance! (Sorry for not providing a reproducible example; I couldn't come up with one for this particular problem.)
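One possible route, sketched below with openxlsx (which does not depend on Java): load the existing workbook, add a new sheet, and save the workbook back. The file name, sheet name, and my_dataframe are placeholders, not anything from the question.
library(openxlsx)

wb <- loadWorkbook("existing_report.xlsx")            # read the workbook that is already on disk
addWorksheet(wb, "new_data")                          # add a sheet instead of overwriting the file
writeData(wb, "new_data", my_dataframe)               # write the data.frame into the new sheet
saveWorkbook(wb, "existing_report.xlsx", overwrite = TRUE)
The existing sheets are preserved; only the new sheet is added, which is effectively the append behaviour that write.xlsx from the xlsx package provided.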

R: Exporting massive files within a loop

I have code that generates several very large dataframes in a loop. Each one has around 300 million rows, so I run out of memory before the loop is over. I am trying to export each dataframe once it is constructed within the loop and then remove it to free up space in my R environment before I start constructing the next one.
The issue is how to export these very large datasets. I tried using fwrite from the data.table package, but when I open the csv file I get an empty workbook called Book1 instead. I also tried saving it as a dta file using write.dta from the foreign package, but Stata tells me it is corrupted when I try to open it.
In the end, saving each dataframe as .csv with fwrite and opening it in Stata worked perfectly!
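A minimal sketch of that write-then-free pattern; build_chunk() and the output file names are hypothetical placeholders, not from the question:
library(data.table)

for (i in 1:10) {
  dt <- build_chunk(i)                        # hypothetical function that builds one large data.table
  fwrite(dt, sprintf("chunk_%02d.csv", i))    # export it to its own csv before building the next one
  rm(dt); gc()                                # free the memory so the loop can keep going
}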

How to write data into a macro-enabled Excel file (write.xlslx corrupts my document)?

I'm trying to write a table into a macro-enabled Excel file (.xlsm) from R. The write.xlsx (openxlsx) and writeWorksheetToFile (XLConnect) functions don't work.
When I used the openxlsx package, as seen below, the resulting .xlsm files ended up corrupted.
Code:
library(XLConnect)
library(openxlsx)
for (i in 1:3) {
  write.xlsx(Input_Files[[i]], Inputs[i], sheetName = "Input_Sheet")
}
# Input_Files[[i]] are the R data.frames which need to be inserted into the .xlsm files
# Inputs[i] are the Excel files into which the tables should be written
Corrupted .xlsm file error message after write.xlsx:
Excel cannot open the file 'xxxxx.xslm' because the file format or file extension is not valid. Verify that the file has not been corrupted and that the file extension matches the format of the file
After researching this problem extensively, I found that the XLConnect package offers the writeWorksheetToFile function, which works with .xlsm, although after running it a few times it throws an error saying there is no more free space. It also runs for 20+ minutes for tables with approximately 10,000 rows. I tried adding xlcFreeMemory at the beginning of the for loop, but it doesn't solve the issue.
Code:
library(XLConnect)
library(openxlsx)
for (i in 1:3) {
  xlcFreeMemory()
  writeWorksheetToFile(Inputs[i], Input_Files[[i]], "Input_Sheet")
}
# Input_Files[[i]] are the R data.frames which need to be inserted into the .xlsm files
# Inputs[i] are the Excel files into which the tables should be written
Could anyone recommend a way to easily and quickly transfer an R table into an xlsm file without corrupting it?
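One direction worth trying, sketched below. write.xlsx always builds a fresh xlsx-format workbook, which drops the VBA project and is why Excel then refuses the .xlsm file. openxlsx can instead load the existing macro-enabled workbook and write into it; as far as I know, loadWorkbook keeps the VBA project when the workbook is saved again. The sheet name matches the question; treat the rest as assumptions rather than a verified fix.
library(openxlsx)

for (i in 1:3) {
  wb <- loadWorkbook(Inputs[i])                     # load the existing .xlsm, including its VBA project
  if (!"Input_Sheet" %in% names(wb)) {
    addWorksheet(wb, "Input_Sheet")                 # create the sheet only if it is not already there
  }
  writeData(wb, "Input_Sheet", Input_Files[[i]])    # write the data.frame into that sheet
  saveWorkbook(wb, Inputs[i], overwrite = TRUE)     # save back under the same .xlsm name
}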

How a function can read a static csv file inside its package

I am developing an R package and some functions need to read a static .csv file inside the package, using the read.csv function.
I read some text about this:
http://r-pkgs.had.co.nz/data.html
http://tinyheero.github.io/jekyll/update/2015/07/26/making-your-first-R-package.html
They recommend saving the files in inst/extdata, but I still don't get it. Is inst/extdata a folder inside my package? I want my functions to read the .csv file with read.csv, not with the system.file function.
Some help here would be nice.
system.file() only provides the path to the file inside the installed package; read.csv() still does the actual reading. Files placed in inst/extdata end up under extdata/ once the package is installed. If I want to pull data from a csv file that's included in a package, I can use:
data <- system.file("extdata", "datafile.csv", package = "mypackagename")
data_df <- read.csv(data)

Importing .xls file that is saved as *.htm, *.html as it is saved on the backend

I have a requirement where I have to import an .xls file that is actually saved as *.htm / *.html.
How do I load this into a data frame in R? The data is in Sheet1, starting from row 5. I have been struggling with this, trying to load it with the xlsx package and the readxl package, but neither of them worked because the native format of the file is different.
I can't manually edit and re-save the file as .xlsx, as that cannot be automated.
Also to note: if I re-save it as an .xlsx file it works fine, but that's not what I need.
Kindly help me with this.
Try the openxlsx package and its read.xlsx function. If that doesn't work, you could programmatically rename the file, as described for example here, and then open it with one of these Excel packages.
Your file could be in xls format instead of xlsx; have you tried the read_xls() function from readxl? Or it could be in text format, in which case read.table() or fread() from data.table should work. The fact that it opens after being re-saved as xlsx strongly suggests that it is not an xlsx file to begin with.
Hope this helps.
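If the file really is an HTML table wearing an .xls extension, a sketch along these lines may work; the file name and the table position are assumptions, and rvest (not mentioned above) is used for the HTML parsing:
library(rvest)
library(data.table)

doc  <- read_html("report.xls")        # the .xls extension doesn't matter; the content is HTML
tbls <- html_table(doc)                # extract every <table> in the document as data frames
raw  <- tbls[[1]]                      # assume the data of interest is in the first table
dt   <- as.data.table(raw[-(1:4), ])   # drop the first four rows so the data starts at "row 5"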
