Looks like when I try to read more than 100k rows from an Excel file, I get this error message:
Error: Cell references aren't uniformly A1 or R1C1 format:
df <- read_xlsx("Test.xlsx",
                col_names = TRUE, sheet = "Data",
                range = "G1:AL170000")
If I try to read fewer than 100k rows, it works fine. What am I doing wrong? Any ideas?
Hi, I came across the exact same problem and found this solution in the RStudio Community.
For your problem I would go with:
df <- read_xlsx("Test.xlsx",
                col_names = TRUE, sheet = "Data",
                range = cell_limits(c(1, 7), c(NA, 38)))
cell_limits() takes the upper-left and lower-right corners as c(row, col) pairs: c(1, 7) is G1 (G is the 7th column) and c(NA, 38) means column AL (the 38th) down to the last populated row, which sidesteps the parsing of a literal range string.
I have over 200 CSV files containing temperature data from iButton data loggers. The CSV files created in OneWireViewer have 14 rows of metadata that I need to get rid of in all of the files (see the image below) so that I can then merge the CSV files based on the column headings.
[Image: OneWireViewer output CSV file]
I'd love to be able to automate it in some way, as I have around 70 folders (basically one folder per location) with 2-3 CSV files from OneWireViewer in each folder.
I've tried messing around with bits of code I've found online, but I couldn't get anything to work and I'm now just incredibly frustrated. Any and all help is greatly appreciated!
If it helps, I did try running the tidyverse code found here: Remove certain rows and columns in multiple csv files under the same folder in R, but I get this error:
Column specification -----------------------------------------------------
Delimiter: ","
chr (1): 1-Wire/iButton Part Number: DS1921G-F5
i Use spec() to retrieve the full column specification for this data.
i Specify the column types or set show_col_types = FALSE to quiet this message.
Error: Can't subset columns that don't exist.
x Locations 2, 3, 4, 5, 6, etc. don't exist.
i There are only 1 column.
Run rlang::last_error() to see where the error occurred.
In addition: Warning messages:
1: One or more parsing issues, see problems() for details
2: One or more parsing issues, see problems() for details
Try something like

library(data.table)
rbindlist(lapply(list.files(...), fread, skip = 14, ...), ...)

where ... stands for the functions' arguments. Check out ?list.files, ?data.table::fread and ?data.table::rbindlist to find out more about them. A fleshed-out version is sketched below.
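For instance, here is a hedged sketch under the folder layout you describe (one folder per location with the CSVs inside); the top-level path "temperature_data" and the fill = TRUE choice are assumptions, so adjust them to your setup:

library(data.table)

# assumption: "temperature_data" contains the ~70 location folders
files <- list.files("temperature_data", pattern = "\\.csv$",
                    recursive = TRUE, full.names = TRUE)

# skip the 14 OneWireViewer metadata rows in each file, then stack them
all_data <- rbindlist(lapply(files, fread, skip = 14),
                      fill = TRUE)  # tolerant of differing column order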
I am brand new to R and I am trying to run some existing code that should clean up an input .csv and then save the cleaned data to a different location as a .RData file. This code ran fine for the previous owner.
The code seems to be pulling in the .csv and cleaning it just fine. It also looks like the save runs (there are no errors), but there is no output in the specified location. I thought maybe R was having a hard time finding the location, but it pulls the input data okay and the destination is just a subfolder.
After a full day of extensive Googling, I can't find anything related to a save just not working.
Example code below:
save(data, file = "C:\\Users\\my_name\\Documents\\Project\\Data.RData", sep="")
Hard to believe you don't see any errors - unless something has switched errors off:
> data = 1:10
> save(data, file="output.RData", sep="")
Error in FUN(X[[i]], ...) : invalid first argument
It's a misleading error: the problem is the third argument, sep, which save() doesn't accept and which doesn't do anything here. Remove it and it works:
> save(data, file="output.RData")
>
sep is an argument used when writing delimited text files (for example with write.table) to separate columns. save() writes binary data, which doesn't have rows and columns.
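For contrast, a minimal sketch of where sep actually belongs and how to verify the fixed save(); the file names are just examples:

data <- 1:10
save(data, file = "output.RData")   # binary format; no sep argument
load("output.RData")                # restores the object 'data'

write.table(data, file = "output.txt",
            sep = ",")              # here sep really does separate columns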
This is my first post so I will try to be specific.
I have imported a few .csv files and I am trying to combine them.
When I inspect each individual data frame as I import it, I can open it in the RStudio View window and the data looks correct.
However, once I combine the data frames using Master <- do.call("rbind", list(DF1, DF2, DF3, DF4)) and try to view the Master table, I get the following message:
Error in if (nchar(col_min_c) >= 16 || grepl("e", col_min_c, fixed = TRUE) || : missing value where TRUE/FALSE needed
However, when I view all the original data frames I am able to see them with no problem.
If I use utils::View(Master) I am able to see the data frame.
So I am not sure where this issue comes from.
These are the packages I am loading:
require(data.table)
require(dplyr)
require(sqldf)
require(ggplot2)
require(stringr)
require(reshape2)
require(bit64)
Thanks for any help you can provide.
I was able to get around this issue by transforming my table via:
Master <- sqldf("SELECT * FROM 'Master'")
So I hope this helps others in the case they come across a similar issue in the future.
I was also able to view the table once I removed the NA values from a long numeric column (19 characters) on the far left-hand side of the table (column 1).
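Since bit64 is loaded above, that 19-character column is likely a bit64::integer64 column, which the RStudio viewer has been known to trip over when it contains NAs. A hedged workaround, assuming the offending column really is column 1 of Master, is to convert it to character before viewing:

library(bit64)

# assumption: column 1 is an integer64 ID column containing NAs
Master[[1]] <- as.character(Master[[1]])  # character columns render fine in the viewer
View(Master)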
I have a data frame (panel form) in R with 194498 rows and 7 columns. I want to write it to an Excel file (.xlsx) using res <- write.xlsx(df, output), but R hangs (it keeps showing the stop sign at the top left of the console) without making any change to the target file (output), and finally shows the following:
Error in .jcheck(silent = FALSE) :
Java Exception <no description because toString() failed>.jcall(row[[ir]], "Lorg/apache/poi/ss/usermodel/Cell;", "createCell", as.integer(colIndex[ic] - 1))<S4 object of class "jobjRef">
I have loaded the readxl and xlsx packages. Please suggest how to fix this. Thanks.
Install and load the package named 'WriteXLS' and try writing out your R object using the function WriteXLS(). Make sure the name of your R object is passed as a quoted string, like "data" below.
# Store your data (194498 rows, 7 columns) in a data frame named 'data'

# Install the WriteXLS package
install.packages("WriteXLS")

# Load the package
library(WriteXLS)

# Write the R object 'data' to an Excel file named data.xlsx
WriteXLS("data", ExcelFileName = "data.xlsx",
         row.names = FALSE, col.names = TRUE)
Hope this helped.
This does not answer your question, but might be a solution to your problem.
You could save the file as a CSV instead, like so:
write.csv(df, "df.csv")
Then open the CSV and save it as an Excel file.
I gave up on trying to import/export Excel files with R because of hassles like this.
In addition to Pete's answer: I wouldn't recommend write.csv here, because it can take minutes for a table this size. I used fwrite() (from the data.table package) and it did the same thing in about 1-2 seconds.
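A minimal sketch, assuming df is the 194498 x 7 data frame from the question:

library(data.table)
fwrite(df, "df.csv")  # drop-in replacement for write.csv, much faster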
The post author asked about large files. I dealt with a table about 2.3 million rows long, and write.xlsx (and fwrite) weren't able to write more than about 1 million rows for me; the rest of the data was simply cut away. So instead use write.table(Data, file = "Data.txt"). You can open it in Excel and split the single column by your delimiter (set it with the sep argument), and voila!
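A sketch of that suggestion; the semicolon separator and file name are just examples (keep in mind a single Excel sheet still tops out at 1,048,576 rows, even when the text file itself holds more):

# write delimited text; the text file itself has no row limit
write.table(Data, file = "Data.txt", sep = ";",
            row.names = FALSE, quote = FALSE)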
I am using large datasets for my research (4.72GB), and I discovered the "bigmemory" package in R, which supposedly handles large datasets (up to the range of 10GB). However, when I use read.big.matrix to read a csv file, I get the following error:
> x <- read.big.matrix("x.csv", type = "integer", header=TRUE, backingfile="file.bin", descriptorfile="file.desc")
Error in read.big.matrix("x.csv", type = "integer", header = TRUE, :
  Dimension mismatch between header row and first data row.
I think the issue is that the csv file is not full, i.e., it is missing values in several cells. I tried removing header = TRUE but then R aborts and restarts the session.
Does anyone have experience with reading large csv files with missing data using read.big.matrix?
It may not solve your problem directly, but you might find a package of mine, filematrix, useful. The relevant function is fm.create.from.text.file.
Please let me know if it works for your data file.
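A hedged sketch of how I'd expect the call to look for your file; I'm writing the argument names from memory, so treat them as assumptions and check ?fm.create.from.text.file for the exact interface:

library(filematrix)

# assumption: x.csv is comma-delimited; verify the argument names
# against ?fm.create.from.text.file in your installed version
fm <- fm.create.from.text.file(textfilename = "x.csv",
                               filenamebase = "x_fm")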
Did you check the bigmemory PDF at https://cran.r-project.org/web/packages/bigmemory/bigmemory.pdf?
It was clearly described right there.
# Write the matrix with both column names and row names:
write.big.matrix(x, 'IrisData.txt', col.names = TRUE, row.names = TRUE)

# Read it back, telling bigmemory about the row-name column:
y <- read.big.matrix("IrisData.txt", header = TRUE, has.row.names = TRUE)

# The following would fail with a dimension mismatch:
if (FALSE) y <- read.big.matrix("IrisData.txt", header = TRUE)
Basically, the error means there is a column in the CSV file containing row names. If you don't pass has.row.names = TRUE, bigmemory treats the row names as a separate data column, so the header row has one entry fewer than each data row and you get the dimension mismatch.
I personally found the data.table package more useful for dealing with large data sets, YMMV.
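For reference, a minimal sketch of the data.table route; fread copes with files of this size and reads empty cells as NA (the file name is from the question):

library(data.table)
x <- fread("x.csv", header = TRUE)  # fast multi-threaded reader; missing cells become NA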