How to add external data files to an R package under development?

I am building an R package in RStudio and ran into an unexpected problem when I tried to create the package's vignette. When I use the Build/Load panel in RStudio, I get a vignette error, although the package documentation is created. To solve the vignette error, I need to add external data to my package and use that data to compile the vignette. I used devtools::install() to install my package, but the inst/ directory was not created, and extdata must be located in inst/. I also tried devtools::use_data() to add data from my PC, but I was not able to add my external data that way. How can I load external data for my package? I assumed I should not manually create extdata and put the external data there. Why was inst/ not created when I used devtools::install()? How do I add a set of CSV files as external data to my package?
This is the toy helper function I am going to use in my vignette to read the external data:
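myFunc <- function(myDir, ...) {
  # list every .csv file in myDir, with full paths
  files <- list.files(myDir, pattern = "\\.csv$", full.names = TRUE)
  # read each file into a data frame; the result is a list of data frames
  readMe <- lapply(files, read.csv)
  return(readMe)
}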
This is the first time I have built an R package, so I am probably hitting some common errors. My apologies if my question is not well stated.
To find files in inst/, I need to use system.file(), but I don't have this directory. Also, myFunc accepts a directory path, finds the files in it, and reads them as .csv. This toy code chunk could be executed in the vignette file:
```{r}
library(myPkg)
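# system.file() needs package = ...; an unnamed third argument is treated as a path component
data.dir <- system.file("extdata", package = "myPkg")
# myFunc() expects a directory and reads every .csv file in it (xxx.csv, yyy.csv)
myFunc(data.dir)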
```
How can I load external data into my package so that I can compile the package vignette with it? Why was inst/ not created when I ran devtools::install()? Can anyone help me with this? Thanks in advance :)

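You should manually create the inst/extdata/ directory (for example inst/extdata/file.csv) in the base directory of your project (where DESCRIPTION is). devtools::install() only installs the package; it does not create source directories for you. You can put all the files you want to access in that directory.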
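A minimal sketch of that layout, built in a temporary directory so it can be run anywhere. The package name myPkg and the file names xxx.csv and yyy.csv come from the question; the temporary location and the sample data frames are assumptions for illustration only. In a real project, pkg_root would be the directory that contains DESCRIPTION.

```r
# Stand-in for your package source directory (assumption: a temp location).
pkg_root <- file.path(tempdir(), "myPkg")
extdata  <- file.path(pkg_root, "inst", "extdata")

# Create inst/extdata by hand; recursive = TRUE also creates inst/.
dir.create(extdata, recursive = TRUE, showWarnings = FALSE)

# Placeholder CSV files; in practice, copy your own files here instead.
write.csv(data.frame(id = 1:3, value = c(2.5, 3.1, 4.8)),
          file.path(extdata, "xxx.csv"), row.names = FALSE)
write.csv(data.frame(id = 4:6, value = c(1.2, 0.7, 9.9)),
          file.path(extdata, "yyy.csv"), row.names = FALSE)

# Show the resulting source layout.
list.files(pkg_root, recursive = TRUE)
```

After the package is installed, the contents of inst/ are copied to the top level of the installed package, which is why system.file("extdata", ...) is called without the inst/ prefix.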
Then to get the files in function examples or your vignette:
files <- lapply(list.files(system.file('extdata', package = 'my_package'), full.names = TRUE), read.csv)
system.file() returns the path to the extdata folder, then list.files() will create a vector of all the files in extdata. Finally, running lapply() with read.csv() should read the contents of all the files into a single list for you.
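The same steps can be wrapped in a small function. This is only a sketch: read_pkg_csvs and its subdir argument are hypothetical names, not part of any package. Because my_package is not installed on your machine, the demonstration at the end uses a CSV file that ships in the misc directory of base R's utils package.

```r
# Hypothetical helper: read every .csv file in one directory of an installed package.
read_pkg_csvs <- function(package, subdir = "extdata") {
  dir_path <- system.file(subdir, package = package)
  # system.file() returns "" when the directory does not exist.
  if (!nzchar(dir_path)) {
    stop("No '", subdir, "' directory found in package '", package, "'")
  }
  files <- list.files(dir_path, pattern = "\\.csv$", full.names = TRUE)
  # Name each list element after its file so it is easy to look up.
  stats::setNames(lapply(files, read.csv), basename(files))
}

# Runnable demonstration with a CSV file bundled with the utils package.
demo <- read_pkg_csvs("utils", subdir = "misc")
names(demo)
```

In a vignette for your own package, the call would be read_pkg_csvs("my_package") once inst/extdata exists and the package has been reinstalled.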

Related

How a function can read a static csv file inside its package

I am developing an R package, and some functions need to read a static .csv file inside the package using the read.csv function.
I read some material about this:
http://r-pkgs.had.co.nz/data.html
http://tinyheero.github.io/jekyll/update/2015/07/26/making-your-first-R-package.html
They recommend saving the files in inst/extdata, but I still don't get it. Is inst/extdata a folder inside my package? I want my functions to read the .csv file using the read.csv function, not the system.file function.
Some help here would be nice.
system.file() provides the path to the file in the package. If I want to pull data from a csv file that's included in a package, I can use:
data <- system.file("extdata", "datafile.csv", package = "mypackagename")
data_df <- read.csv(data)
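Inside a package, the two calls can live in one function, so callers only ever see a data frame: system.file() supplies the path and read.csv() still does the reading. The sketch below is an assumption about how such a function might look; read_static_csv and its subdir argument are hypothetical names. The demonstration reads a CSV file bundled with base R's utils package, since mypackagename is only a placeholder.

```r
# Hypothetical package function: locate a bundled file, then read it.
read_static_csv <- function(file, package, subdir = "extdata") {
  # mustWork = TRUE turns a missing file into an error instead of "".
  path <- system.file(subdir, file, package = package, mustWork = TRUE)
  read.csv(path)
}

# In your own package this would be, for example:
# read_static_csv("datafile.csv", package = "mypackagename")

# Runnable demonstration with a file that ships with the utils package.
df <- read_static_csv("exDIF.csv", package = "utils", subdir = "misc")
str(df)
```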

lapply and readxl Error

I am using readxl and lapply to import multiple .xlsx files into my environment. The following worked perfectly before but now when I try to re-run it, it gives me the following error:
Error in read_fun(path = path, sheet = sheet, limits = limits, shim = shim, :
Evaluation error: zip file 'data.xlsx' cannot be opened.
Code:
setwd("./Data Folder") #set path in order to avoid lapply error (This is what solved it last time I got errors)
# Load all "Data Folder" datasets
library(readxl)
file.list <- list.files(path = "./Data Folder", pattern = '*.xlsx')
df.list <- lapply(file.list, read_excel)
I have checked if the path I entered is still correct and I didn't alter it by mistake. I have also tried to open the documents in the folder using excel and there is no problem with the files. Any ideas?
I have figured out the problem. I had two different tabs open in RStudio, one an R Markdown document and the other an R script. I was running the code in the R Markdown document without realising it, so I got the lapply error because the working directory set by setwd() did not persist.
If anyone has this problem at any point:
check whether you are in an R script or an R Markdown document
set the folder you are reading the data from as your working directory
in R Markdown, run the entire chunk at once
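One way to avoid depending on the working directory at all is to ask list.files() for full paths. The sketch below is an assumption-laden illustration, not the original poster's code: it uses small CSV files in a temporary folder as a stand-in so that it runs without readxl. For .xlsx files, the same pattern should apply with pattern = "\\.xlsx$" and readxl::read_excel in place of read.csv.

```r
# Stand-in data folder in a temporary location (assumption for illustration).
data_dir <- file.path(tempdir(), "Data Folder")
dir.create(data_dir, showWarnings = FALSE)
write.csv(data.frame(a = 1:3), file.path(data_dir, "one.csv"), row.names = FALSE)
write.csv(data.frame(a = 4:6), file.path(data_dir, "two.csv"), row.names = FALSE)

# full.names = TRUE returns paths that include data_dir, so no setwd() is needed.
# The pattern is a regular expression, not a glob, hence "\\.csv$".
file.list <- list.files(path = data_dir, pattern = "\\.csv$", full.names = TRUE)
df.list <- lapply(file.list, read.csv)
length(df.list)
```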

How to open .rdb file using R

My question is quite simple, but I couldn't find the answer anywhere.
How do I open a .rdb file using R?
It is placed inside an R package.
I have been able to solve the problem, so I am posting the answer here in case someone needs it in the future.
#### Importing data from .rdb file ####
setwd("path...\\Rsafd\\Rsafd\\data") # Set the working directory to the folder that contains
# your .rds and .rdb files.
readRDS("Rdata.rds") # see metadata contained in .rds file
# lazyLoad is the function we use to open a .rdb file:
lazyLoad(filebase = "path...\\Rsafd\\Rsafd\\data\\Rdata", envir = parent.frame())
# for filebase, Rdata is the name of the .rdb file.
# envir is the environment on which the objects are loaded.
The result of using the lazyLoad function is that every dataset contained in the .rdb file shows up in your environment as a "promise". This means that a dataset is not actually loaded into memory until you first access it.
The way to open it is the following:
find(HOWAREYOU) # open the file named HOWAREYOU
head(HOWAREYOU) # look at the first entries, just to make sure
Edit: readRDS is not part of the process to open the .rdb file, it is just to look at the metadata. The lazyLoad function indeed opens .rdb files.
Posting a slightly more direct answer since I keep Googling to this Q&A when trying to examine .rdb objects inside an R package (in particular the help/package.rdb file) and not seeing the answer clearly enough.
R keeps the help Rd objects for the installed package pkg at help/$pkg.{rdb,rdx}.
We can load these Rd objects into environment e like so:
lazyLoad(
  file.path(system.file("help", package = pkg), pkg),
  envir = e
)
Note that we can't use system.file("help", pkg, package=pkg) because system.file() requires the file to exist or it returns "", and here we've truncated the .rdb/.rdx extension as required by lazyLoad().
We can skip supplying envir=e, but the objects will be loaded into the global environment (assuming you're running this interactively) and I wanted my default answer to avoid polluting it.
See ?lazyLoad for more.
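A complete, runnable version of the snippet above might look like this. The choice of the stats package is only an example, and the sketch assumes your R installation includes the packages' help files (some minimal installations omit them).

```r
pkg <- "stats"   # any installed package; "stats" is used here only as an example
e <- new.env()   # keep the loaded objects out of the global environment

# filebase is the path to help/<pkg> without the .rdb/.rdx extension.
lazyLoad(file.path(system.file("help", package = pkg), pkg), envir = e)

topics <- ls(e)  # names of the Rd objects stored in help/stats.rdb
head(topics)

# Accessing an object forces its promise and loads it from the database.
class(get(topics[1], envir = e))
```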

Including Rnw files within a package

I am writing a package whose sole purpose is to create reports. I am using knitr to generate the reports from a .Rnw file. This all happens within a function in the package, e.g.:
create_report <- function(data) {
  # knit2pdf() takes `input` and `output`, not `from` and `to`
  knitr::knit2pdf(input = "myreport.Rnw", output = "myreport.tex")
  # The Rnw in the knit2pdf function uses the data passed to this function
}
My question is simple. Where within my package folders do I store the .Rnw file? Currently my package has the following folders:
.Rproj.user
data
man
R
I am just not sure where my Rnw scripts should go? Do I need another folder called LaTeX for example? This is like having a separate folder for C++ scripts, for example.
Note, I am not looking to create a vignette. I know how to do this. This package is used to do some data manipulation and then generate a report on the data.
I have tried to lay everything out as clearly as I can as some questions I have asked on here before have been misinterpreted. Please ask if anything is unclear.
To answer this question:
Put the .Rnw files in ./pkgname/inst/latex. When the package is built and installed, the contents of inst/ are copied to the top level of the installed package, so the latex folder ends up at its root. You can then locate the .Rnw files with system.file("latex", "mytemplate.Rnw", package = "pkgname").
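Put together, the report function could look roughly like the sketch below. template_path is a hypothetical helper, and pkgname and mytemplate.Rnw are the placeholders from the answer. Calling create_report() requires the knitr package and a LaTeX installation, so only the lookup is demonstrated here, using an .Rnw file that ships with base R's utils package.

```r
# Hypothetical helper: resolve a template stored under inst/<subdir>/ of a package.
template_path <- function(template, package, subdir = "latex") {
  path <- system.file(subdir, template, package = package)
  # system.file() returns "" when the file is not found.
  if (!nzchar(path)) {
    stop("Template '", template, "' not found in package '", package, "'")
  }
  path
}

create_report <- function(data, package = "pkgname") {
  rnw <- template_path("mytemplate.Rnw", package)
  # envir = environment() lets the chunks in the template see `data`.
  # Requires the knitr package and a LaTeX installation.
  knitr::knit2pdf(input = rnw, envir = environment())
}

# Runnable check of the lookup with an .Rnw file bundled with utils.
template_path("Sweave-test-1.Rnw", package = "utils", subdir = "Sweave")
```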

How to point to a directory in an R package?

I am making my first attempt at writing an R package. I am currently loading a csv file from my hard drive, and I hope to bundle my R code and my csv files into one package later.
My question is: how can I load my csv file once the package is built? Right now my file path is something like c:\R\mydirectory....\myfile.csv, but after I send the package to someone else, how can I refer to that file with a relative path?
Feel free to correct this question if it is not clear to others!
You can put your csv files in the data directory or in inst/extdata.
See the Writing R Extensions manual - Section 1.1.5 Data in packages.
To import the data you can use, e.g.,
R> data("achieve", package="flexclust")
or
R> read.table(system.file("data/achieve.txt", package = "flexclust"))
Look at the R help for package.skeleton: this function
automates some of the setup for a new source package. It creates directories, saves functions, data, and R code files to appropriate places, and creates skeleton help files and a ‘Read-and-delete-me’ file describing further steps in packaging.
The directory structure created by package.skeleton includes a data directory. If you put your data here it will be distributed with the package.
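To see what package.skeleton() produces, here is a small sketch that builds a throwaway package in a temporary directory. The package name demoPkg and the objects achieve_demo and mean_score are made up for illustration. Non-function objects in the environment are saved under data/, and functions under R/.

```r
# Environment holding the objects to package (made-up examples).
demo_env <- new.env()
demo_env$achieve_demo <- data.frame(id = 1:3, score = c(10, 12, 9))
demo_env$mean_score <- function(x) mean(x$score)

# Create the skeleton under a temporary directory; force = TRUE allows reruns.
pkg_parent <- tempdir()
utils::package.skeleton(name = "demoPkg", environment = demo_env,
                        path = pkg_parent, force = TRUE)

# Inspect the generated source tree.
list.files(file.path(pkg_parent, "demoPkg"), recursive = TRUE)
```

Note that the skeleton has no inst/ directory, so inst/extdata still has to be created by hand if you want to ship raw csv files.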
