How can a function read a static .csv file from inside its R package?

I am developing an R package, and some functions need to read a static .csv file inside the package using the read.csv function.
I read some material about this:
http://r-pkgs.had.co.nz/data.html
http://tinyheero.github.io/jekyll/update/2015/07/26/making-your-first-R-package.html
They recommend saving the files in inst/extdata, but I still don't get it. Is inst/extdata a folder inside my package? I want my functions to read the .csv file using the read.csv function, not the system.file function.
Some help here would be nice.

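system.file() does not read the file; it returns the full path to a file inside an installed package, and you then pass that path to read.csv(). Files placed in inst/extdata in the package source end up in extdata/ of the installed package. If I want to pull data from a csv file that's included in a package, I can use: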
data <- system.file("extdata", "datafile.csv", package = "mypackagename")
data_df <- read.csv(data)
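As a minimal sketch of how this could look inside a package function: the helper name read_pkg_csv and its subdir argument are hypothetical, and "mypackagename" and "datafile.csv" are the placeholders from the answer above. The function resolves the path with system.file() and hands it to read.csv(). Setting mustWork = TRUE makes system.file() raise an error instead of silently returning an empty string when the file is missing.

```r
# Hypothetical helper, not part of any existing package.
# The defaults reuse the placeholder names from the answer above.
read_pkg_csv <- function(file = "datafile.csv",
                         package = "mypackagename",
                         subdir = "extdata",
                         ...) {
  # Resolve the installed location; stop with an error if the file is not there.
  path <- system.file(subdir, file, package = package, mustWork = TRUE)
  # Extra arguments (for example stringsAsFactors) are passed on to read.csv().
  utils::read.csv(path, ...)
}
```

Other functions in the package could then call read_pkg_csv() instead of hard-coding a path, so the same code works wherever the package is installed.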

Related

Importing .xls file that is saved as *.htm, *.html as it is saved on the backend

I have a requirement to import an .xls file that is actually saved as *.htm / *.html.
How do we load this into a data frame in R? The data is in Sheet1, starting at row 5. I have been struggling to load it with the xlsx and readxl packages, but neither worked because the native format of the file is different.
I can't edit and re-save the file manually as .xlsx, because that cannot be automated.
Note that when I do save it as an .xlsx file, it loads fine, but that's not what I need.
Kindly help me with this.
Try the openxlsx package and its read.xlsx function. If that doesn't work, you could rename the file programmatically and then open it using one of these Excel packages.
Your file could be in xls format instead of xlsx; have you tried the read_xls() function from readxl? It could also be in a text format, in which case read.table() or fread() from data.table should work. The fact that it works after saving the file as xlsx strongly suggests that it was not an xlsx file to begin with.
Hope this helps.
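To find out which of these cases applies, one option is to look at the first bytes of the file rather than its extension. The sketch below is only an illustration in base R; the helper name sniff_format is hypothetical. It relies on two facts: xlsx files are ZIP archives and start with the bytes "PK" 0x03 0x04, and legacy xls files are OLE2 compound files starting with 0xD0 0xCF 0x11 0xE0. Anything else, such as an HTML page saved with an .xls extension, is reported as text or other.

```r
# Hypothetical helper: guess the real container format from the leading bytes.
sniff_format <- function(path) {
  magic <- readBin(path, what = "raw", n = 4)
  zip_sig  <- as.raw(c(0x50, 0x4b, 0x03, 0x04))  # "PK\003\004", used by xlsx
  ole2_sig <- as.raw(c(0xd0, 0xcf, 0x11, 0xe0))  # OLE2 header, used by legacy xls
  if (identical(magic, zip_sig)) {
    "xlsx (zip)"
  } else if (identical(magic, ole2_sig)) {
    "xls (OLE2)"
  } else {
    "text or other"
  }
}
```

Depending on the result, you would then pick read.xlsx(), read_xls(), or a text reader such as read.table() or fread().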

How to load xlsx file using fread function?

I want to use the fread function to load all my datasets, since I think it is better to stick to one import function.
A few of my files are in xlsx format, so I was saving them as csv and then loading them with fread.
But I noticed that when I converted the xlsx files to csv, an empty or incomplete row was created in the new csv files.
Is there a way to resolve this? Can I somehow load an xlsx file with fread directly, rather than converting it to csv first?
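fread cannot parse Excel files itself, but it can read the output of a shell command, so you can use a command-line tool such as csvkit's in2csv like this:
library(data.table)
my.dt <- fread(cmd = 'in2csv my.xls')  # requires csvkit's in2csv on the PATH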

Append new lines to a .Rda file in R

Writing a fresh .Rda file to save a data.frame is easy:
df <- data.frame(a=c(1,2,3,4), b=c(5,6,7,8))
save(df,file="data.Rda")
But is it possible to write more data afterwards? There is no append=TRUE option in save.
By contrast, writing new lines to a text file is easy using:
write.table(df, file = 'data.txt', append = TRUE)
However, for large data.frames the resulting text file is much larger.
If you use Microsoft R, you might want to check the RevoScaleR package, the rxImport function in particular. It lets you store a compressed data.frame in a file, and it can append new rows to an existing file without loading it into your environment.
Hope this helps. A link to the function documentation is below.
https://learn.microsoft.com/en-us/machine-learning-server/r-reference/revoscaler/rximport
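With base R alone there is no true append for .Rda files, but the effect can be approximated by loading the stored object, binding the new rows, and saving again. The sketch below is one way to do that; the helper name append_rda is hypothetical. Unlike rxImport, it reads the whole existing object into memory and rewrites the file each time, and any other objects stored in the same file are dropped.

```r
# Hypothetical helper: emulate "append" for an .Rda file holding one data.frame.
# This rewrites the whole file; it is not an in-place append.
append_rda <- function(new_rows, file, name = "df") {
  env <- new.env()
  if (file.exists(file)) {
    load(file, envir = env)                    # restores the stored object(s)
    new_rows <- rbind(env[[name]], new_rows)   # old rows first, then the new ones
  }
  assign(name, new_rows, envir = env)
  save(list = name, file = file, envir = env)  # only `name` is written back
  invisible(new_rows)
}
```

Calling append_rda(df, "data.Rda") twice with the data.frame from the question would leave a data.frame named df with eight rows in the file.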

How to add external data file into developing R package?

I am building my R package in RStudio and ran into an unexpected problem when I tried to create the package vignette. When I use the Build/Load panel in RStudio, I get a vignette error, although the package documentation is created. To solve the vignette error, I need to add external data to my package and use that data to compile the vignette. I used devtools::install() to install my package, but the inst/ directory was not created, and extdata must be located in inst/. I also tried devtools::use_data() to add data from my PC, but I was not able to add my external data that way. How can I load external data for my package? I assumed I should not manually create extdata and put the external data there. Why was inst/ not created when I used devtools::install()? How do I add a set of csv files as external data to my package?
This is the toy helper function I am going to use in my vignette to read external data :
myFunc <- function(myDir, ...) {
  # collect the full paths of all .csv files in myDir
  files <- list.files(myDir, pattern = "\\.csv$", full.names = TRUE)
  readMe <- lapply(files, read.csv)
  return(readMe)
}
This is the first time I am building an R package, so I may be hitting common errors. My apologies if my question is not well stated.
To find files in inst/, I need to use system.file(), but I don't have this directory. In addition, myFunc accepts a directory, grabs the files in it and reads them as .csv. This toy code chunk would be executed in the vignette file:
```{r}
library(myPkg)
# package must be passed by name; a third positional string is treated as a path component
extdata_dir <- system.file("extdata", package = "myPkg")
myFunc(extdata_dir)
```
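How can I add external data to my package so that the vignette can be compiled with it? Why is inst/ not created when I run devtools::install()? Can anyone help me with this? Thanks in advance :)
You should manually create inst/extdata/file.csv in the base directory of your project (where DESCRIPTION is); devtools::install() does not create inst/ for you. You can put all the files you want to access in that directory. When the package is installed, the contents of inst/ are copied to the top level of the installed package, which is why system.file() can find them under extdata.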
Then to get the files in function examples or your vignette:
files <- lapply(list.files(system.file('extdata', package = 'my_package'), full.names = TRUE), read.csv)
system.file() returns the path to the extdata folder, then list.files() returns a character vector with the full paths of all the files in extdata. Finally, lapply() with read.csv reads each file, giving you a single list of data frames.
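A slightly more defensive variant of the same idea is sketched below. The helper name read_csv_dir is hypothetical, and 'my_package' in the commented call is the placeholder from the answer above. It only picks up files ending in .csv and names each list element after its file, so the data frames can be looked up by name in a vignette.

```r
# Hypothetical helper: read every .csv file in a directory into a named list.
read_csv_dir <- function(dir) {
  files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
  # Name each path after its file (without the extension); lapply() keeps the names.
  names(files) <- tools::file_path_sans_ext(basename(files))
  lapply(files, utils::read.csv)
}

# Assumed usage in a vignette or example:
# tables <- read_csv_dir(system.file("extdata", package = "my_package", mustWork = TRUE))
```

Passing mustWork = TRUE makes system.file() fail with an error if extdata is missing, which is easier to diagnose than an empty list.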

How to read .xlsx file using XLConnect in R

I want to read an .xls or .xlsx file from my hard drive using R. I installed the XLConnect package and have received the following errors:
Data <- readWorksheet(loadWorkbook("C:/test1.xlsx"),sheet=1)
Error: FileNotFoundException (Java): File 'test1.xlsx' could not be found - you may specify to automatically create the file if not existing.
I want to read the first tab of my Excel file. I also tried the gdata read.xls function and failed.
Try setting your working directory with setwd before loading the xlsx file, and then refer to the file by name. Example:
library(XLConnect)
setwd("the location where the file is placed on your pc")
Data <- readWorksheet(loadWorkbook("test1.xlsx"), sheet = 1)
Note: make sure you are using forward slashes instead of backslashes in the setwd path.
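Before handing a path to loadWorkbook, it can also help to check in plain R whether the file is visible at all, since the Java error does not show where R is looking. The sketch below uses only base R; the helper name check_path is hypothetical.

```r
# Hypothetical helper: fail early with a clear message when a path is wrong.
check_path <- function(path) {
  # Expand relative paths against the current working directory.
  full <- normalizePath(path, winslash = "/", mustWork = FALSE)
  if (!file.exists(full)) {
    stop("File not found: ", full, " (working directory: ", getwd(), ")")
  }
  full
}

# Assumed usage with the call from the question:
# Data <- readWorksheet(loadWorkbook(check_path("C:/test1.xlsx")), sheet = 1)
```

If check_path() stops with an error, the problem is the path or the working directory rather than XLConnect itself.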
