I am working with lots of xls and xlsx files at the same time, with no easy way to convert them to the same file type.
I am having trouble reading them in: read.xlsx() from the "xlsx" package works fine with xls files, but I get a Java out-of-memory error when I try to read xlsx files. I tried the following line to increase the memory available to Java, with no success:
options(java.parameters = "-Xmx1000m")
As an alternative I tried read.xlsx() from the "openxlsx" package, but it does not read xls files, and the two packages are not compatible when loaded at the same time. I had the same difficulty with the "XLConnect" package, where I again get Java errors when "xlsx" and "XLConnect" are loaded at the same time.
How do people handle situations like this?
You can consider the read_excel function in the readxl package:
read_excel(path, sheet = 1, col_names = TRUE, col_types = NULL, na = "", skip = 0)
You can also specify which sheet of the file to import, whether the first row contains column names, and which string represents missing values in the Excel files.
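To illustrate the point for a folder that mixes both file types, here is a minimal sketch. It assumes readxl is installed and uses the two example workbooks that ship with the package as stand-ins for your own files; the variable names are only illustrative.

```r
library(readxl)

# Example files shipped with readxl, standing in for your own mix of
# .xls and .xlsx files (for a real folder you could build this vector with
# list.files(dir, pattern = "\\.xlsx?$", full.names = TRUE)).
files <- c(readxl_example("datasets.xls"), readxl_example("datasets.xlsx"))

# read_excel() works out the format itself, so one call handles both types.
tables <- lapply(files, read_excel, sheet = 1)
names(tables) <- basename(files)

# Each element is a tibble holding the first sheet of one file.
sapply(tables, nrow)
```

Because readxl does not use Java, this should also sidestep the Java memory setting from the question.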
I am trying to load an Excel file in RStudio, but each time I run the code this error is displayed:
Error in read_excel("R/win-library/3.6/IMDB_data.xlsx", sheet = "IMDB_data",  :
could not find function "read_excel"
None of the articles I have found resolve the issue. So far I have tried:
changing the working directory
saving the file in the same place as my working directory
importing through choose directory
setwd("~/R/win-library/3.6")
library(readxl)
IMDB_data <- read_excel("R/win-library/3.6/IMDB_data.xlsx",
sheet = "IMDB_data", skip = 2)
Write R code using data “IMDB_data” to
Load CSV in R by skipping second row.
It seems like your readxl library is not loaded.
Do you get any errors when you run library(readxl)?
Your working folder shouldn't matter, and you should probably avoid working inside R's library directory.
The read_excel command should read the file based on the path provided, but your error is not complaining about the missing file. It's complaining about the missing function.
Lastly, if you set the working directory to ~/R/win-library/3.6, then it would be enough to run the following code (provided your readxl library loaded correctly):
IMDB_data <- read_excel("IMDB_data.xlsx", sheet = "IMDB_data", skip = 2)
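A quick way to check the "package not loaded" theory is sketched below. It only uses base R plus readxl itself; installing from your default CRAN mirror is an assumption about your setup.

```r
# requireNamespace() returns FALSE (rather than stopping with an error)
# when the package is not installed.
if (!requireNamespace("readxl", quietly = TRUE)) {
  install.packages("readxl")
}

# Attach the package so its functions are visible in this session.
library(readxl)

# TRUE once the package is attached; FALSE would explain the
# 'could not find function "read_excel"' error.
exists("read_excel")
```

If library(readxl) itself fails, the message it prints is the one worth posting.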
To install the "XLConnect" package in R on my Mac, I did the following:
install.packages("rJava")
install.packages("XLConnect")
Then to load an excel Workbook,
loadWorkbook(filename,create=T)
But it gives me
Error in path.expand(filename) : object 'filename' not found
Can someone help? I apologise for my editing; I am new to this.
In my experience, when it comes to R the differences between Mac and Windows are small. You have already been pointed to XLConnect and openxlsx. I do not like XLConnect because I have had problems with Java's memory limit when reading in a lot of Excel files. I do like readxl, which does the job.
if (!require("readxl")) install.packages("readxl", dependencies = TRUE)
library("readxl")
df <- read_excel(path = path, sheet = "sheetname in excel file")  # path: the path to your Excel file
df <- as.data.frame(df)  # read_excel returns a tibble; convert it to a plain data frame
I have written R code that creates an Excel workbook and adds data to it using the XLConnect package.
wb <- XLConnect::loadWorkbook(Name,create = TRUE)
and added some data frames to this file. Now I want to access this XLConnect object wb from the xlsx package and apply some formatting, such as borders, fonts, wrapped text and alignment, to the data frame inside the file. Is this possible?
Kindly let me know if anything is unclear or needs more clarification.
It seems that it is not possible to use the XLConnect and xlsx packages in the same R session, or at least not within a single package. I am using the openxlsx package to format the Excel file instead.
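For anyone wanting to try the openxlsx route, here is a minimal sketch of the kind of formatting asked about (font, border, wrapped text, alignment). It assumes openxlsx is installed; the sheet name "data", the use of head(iris) and the temporary output file are illustrative choices, not part of the original workflow.

```r
library(openxlsx)

wb <- createWorkbook()
addWorksheet(wb, "data")              # "data" is an illustrative sheet name
writeData(wb, "data", head(iris))     # header row plus 6 data rows, 5 columns

# One style combining bold font, borders on all sides,
# wrapped text and centred horizontal alignment.
cell_style <- createStyle(
  textDecoration = "bold",
  border = "TopBottomLeftRight",
  wrapText = TRUE,
  halign = "center"
)

# Apply the style to the whole written block (rows 1-7, columns 1-5).
addStyle(wb, "data", cell_style, rows = 1:7, cols = 1:5, gridExpand = TRUE)

out_file <- tempfile(fileext = ".xlsx")   # illustrative output path
saveWorkbook(wb, out_file, overwrite = TRUE)
```

This builds and formats the workbook entirely in openxlsx, so neither XLConnect nor xlsx needs to be loaded.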
Until recently I was happily using the r2excel package to save multiple data sheets in Excel format with this script:
library("r2excel")
filename <- "r2excel-example1.xlsx"
wb <- createWorkbook(type="xlsx")
sheet <- createSheet(wb, sheetName = "example1")
xlsx.addTable(wb, sheet, head(iris), startCol=2)
saveWorkbook(wb, filename)
Now that I have updated R to version 3.3.3 this package is no longer available:
devtools::install_github("kassambara/r2excel")
Error in loadNamespace(name) : there is no package called ‘curl’
I have found it very difficult to export files to Excel using any other package in the old version. Can anyone recommend a solution for exporting multiple data frames into a single Excel file that is compatible with R 3.3.3?
(Saving to CSV is not an option for several reasons, including the need to save multiple sheets in one file.)
Thanks
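One possible approach, sketched here under assumptions: the openxlsx package can write a named list of data frames as separate sheets of one workbook, without Java. I have not checked it against R 3.3.3 specifically, and the data frames and the temporary output path below are only placeholders. (The error shown above says the curl package is missing, so installing curl first might also let the devtools install go through, but that is a guess from the message alone.)

```r
library(openxlsx)

# A named list: each name becomes a sheet name, each element one sheet.
sheets <- list(iris = head(iris), mtcars = head(mtcars))

out_file <- tempfile(fileext = ".xlsx")   # placeholder output path
write.xlsx(sheets, file = out_file)

# List the sheets that were written to the file.
getSheetNames(out_file)
```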
I am trying to load Excel worksheets into R using the xlsx package. The files are saved as old 97-2003 worksheets (the extension is .XLS). For newer files the code below worked fine:
df <- read.xlsx(filename,sheetIndex=2)
However, when I try on the older files I get the error message:
Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, :
org.apache.poi.hssf.OldExcelFormatException: The supplied spreadsheet seems to be Excel 5.0/7.0 (BIFF5) format. POI only supports BIFF8 format (from Excel versions 97/2000/XP/2003)
I know the error has to do with the files being in the older format but I do not know how to solve this. I have too many files to manually update each one.
Any suggestions would be greatly appreciated!
P.S. apologies for not adding a fully reproducible example. I do not know how to attach files to go along with my question.
The readxl package is one way to read Excel files. Its advantage is that it has no dependency on Java or other external software.
Your code would be
library(readxl)
df <- read_excel(path = filepath, sheet = 2)
It should work with XLS and XLSX files.
Use excel_sheets(filepath) to get the names of the sheets to import and pass them to the sheet argument of read_excel. You can loop over them if that helps.
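The loop could look roughly like this. It is a sketch that assumes readxl is installed and uses an example workbook shipped with the package in place of your own file; all_sheets and sheet_names are illustrative names.

```r
library(readxl)

# Example workbook shipped with readxl; replace with the path to your own file.
filepath <- readxl_example("datasets.xlsx")

# Names of all sheets in the workbook.
sheet_names <- excel_sheets(filepath)

# Read every sheet into a list of tibbles, one element per sheet.
all_sheets <- lapply(sheet_names, function(s) read_excel(filepath, sheet = s))
names(all_sheets) <- sheet_names
```

A single sheet is then available by name, for example all_sheets[[sheet_names[1]]].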