Storing geometry data in R packages

I read somewhere that writing custom functions and building them into packages is the best way to avoid repeating code and keeping several scripts around, so I'm giving it a try.
devtools::install_github("https://github.com/albersonmiranda/desigualdade")
Here I'm trying to set up functions to shortcut some map plots.
REPRODUCING
In the data-raw folder of the GitHub repo, which I removed from .gitignore only for this thread, there's a script named data.R. Running it gives you an object named censo_des: a tibble with Brazil's 2010 census data and municipality geometries.
I can then load the functions in the /R folder and run RM("RJ", n.nomes = 2) to plot Rio de Janeiro's map, with geom_label() showing the names of the top 2 municipalities with the highest and lowest per capita income for black and white people.
QUESTION
Running usethis::use_data(censo_des, overwrite = TRUE) to create the .rda file and store it in the /data folder seems to change the structure of censo_des, and it's no longer a tibble. When I install the package and try to print censo_des, I get the error message Error: Input must be a vector, not a sfc_GEOMETRY/sfc object. and if I try RM("RJ") I get Error: Can't slice a scalar.
I guess my issue is how to properly store sfc_GEOMETRY data in packages. Any thoughts?

Turns out the package needs to import sf to handle geometry data. The object isn't a regular tibble or data frame: its geometry column is an sfc object, and the methods for it come from sf. Once I imported sf, everything worked fine.
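A minimal sketch of what "importing sf" can look like in a roxygen2-based package. The file name is hypothetical, and st_geometry is just one real sf function picked so that the sf namespace gets loaded together with the package. It also assumes sf is listed under Imports in DESCRIPTION.

```r
# R/desigualdade-package.R (hypothetical file name)
# Assumption: DESCRIPTION already lists sf under Imports,
# e.g. after running usethis::use_package("sf") once.

#' @keywords internal
#' @importFrom sf st_geometry
"_PACKAGE"
```

After devtools::document(), NAMESPACE should contain importFrom(sf,st_geometry), so loading the package also loads sf and its methods for sfc columns.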

Related

Missing command in an R package

So to get to the point: I need to use an R package called machuruku. To get familiar with the package I used the dataset provided in the original paper (https://academic.oup.com/sysbio/article/70/5/1033/6171196). While trying to run the code for the simulation I get an error message saying that the command "machu.simulation" doesn't exist. Any of you have any idea why that's happening? Am I missing a package?
I downloaded the dataset zip file, dove into the second nested zip file Guillory_and_Brown_simulation-validation.zip, then into its file code_simulation-validation.R, and noticed that this source file uses machu.simulation several times before defining the function starting in line 519.
Suggestions:
Grab lines 519 through the end, save into a different file, source that new file, then try to run the code in the beginning of the file again.
Complain (not quietly?) to the authors. The fact that they think this is reproducible means they might have missed something else, too.
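The first suggestion can be sketched in base R as below. source_from_line is a hypothetical helper name, not part of machuruku; it copies everything from a given line onward into a temporary file and sources that.

```r
# Hypothetical helper: source only the tail of a script, starting at `start`.
source_from_line <- function(path, start, envir = globalenv()) {
  lines <- readLines(path, warn = FALSE)
  stopifnot(start >= 1, start <= length(lines))
  tail_file <- tempfile(fileext = ".R")
  writeLines(lines[start:length(lines)], tail_file)
  # Evaluate the extracted lines in the requested environment
  source(tail_file, local = envir)
  invisible(tail_file)
}

# Usage with the file and line number mentioned above
# (the script must be in the working directory):
# source_from_line("code_simulation-validation.R", 519)
```

Once the definitions are loaded this way, the code at the top of the original file can be run as intended.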

Datasets not exported/available from my R package

Following advice about NAMESPACE and External Data formatting/setup, I have:
A. My data files in mypackage/data/datafilename.RData
B. The data script as mypackage/R/data.R with data files individually named and described within that one file, having just changed "itemize" to "describe" and changing the format of those item lines:
C. I've document()-ed this, commit-pushed to github, and install_github reinstalled locally.
Help for the data files works:
But I can't access those data, whereas I can access data in other packages using the same method:
Can anyone think why this would be? NAMESPACE doesn't include these as exports:
But it's autogenerated by document() so that's arguably out of my control. By comparison, mapplots' NAMESPACE has exportPattern(".")
Environment for the package also doesn't include them, but I don't know if this is expected or not, based on lazy loading (which is true):
Any ideas welcome. I've tried data(gbm.auto:grids) with 1, 2 & 3 colons, to no avail. Based on the answer to this related question (also by me), I get the suspicion that there might be some issue whereby only the last named object in data.R is important/accessible?
usethis has been created since I've been updating this package and has use_data and create_package but I'm reluctant to try these out since ostensibly everything in my package should already be in order and I don't want to make things worse.
Thanks in advance. Reprex would be
library(devtools)
install_github("SimonDedman/gbm.auto")
Edit: to add to this, the datasets available in the installed package are a combo of the full list, some individual, some named in datalist:
Which contrasts against what's in the working folder and github:
As far as I can see, all the data files are the same format, e.g. when double-clicked in file explorer they open in RStudio with the right name and same format. The gbm.auto/R/data.R file is here. Per the last image, the three data files listed in datalist can be loaded in R with library(gbm.auto); data(Juveniles), but the other three data files can't. If I delete/rename the existing datalist from /data and generate a new one with add_datalist(pkgname = getwd()), a new file is generated but again it only lists those 3 files, not all 6.
Ugh, goddamn it. Found the issue. The 3 'bad' files had "Rdata" extensions while the 3 good ones had "RData" extensions. Lower case vs capital D. How unbelievably annoying.
Data files in the data directory must use the .RData extension (capital D), not .Rdata.
Bug filed here.
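A quick way to spot this before building, as a sketch: find_lowercase_d is a hypothetical helper, and the demo file names are made up. It only flags the exact "Rdata" spelling described above.

```r
# Hypothetical helper: list files in a package's data/ directory whose
# extension is spelled "Rdata" (lowercase d), the spelling that was not
# picked up in the case described above.
find_lowercase_d <- function(data_dir = "data") {
  files <- list.files(data_dir)
  files[tools::file_ext(files) == "Rdata"]
}

# Demo in a throwaway directory with one file of each spelling
demo_dir <- file.path(tempdir(), "data_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("grids.RData", "samples.Rdata")))
bad <- find_lowercase_d(demo_dir)
print(bad)
```

Any file it reports can be renamed to end in .RData before running document() and reinstalling.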

How to include raw data in an R package

I'm working on the final assignment of the course Building R Packages.
In this assignment, we need to create an R package based on some example functions provided by the instructors. We need to organize and document the package, then make it available on GitHub. My package is called FARS and is already available in this GitHub repo.
I'm having trouble with making raw data available with the package. After following the instructions provided in the course's readings and also in chapter 14.3 of the book Building R Packages, the files are still not being recognized.
What did I do so far?
Prepared all the package's documentation, including roxygen2 tags, DESCRIPTION, README.Md, and vignette, following these steps in addition to instructions provided in the readings and book mentioned;
Created a subdirectory named inst/extdata in the package's directory;
Copied all three example files (.csv.bz2) with raw data to inst/extdata;
Tested the functions using testthat;
Installed my FARS package.
Now I'm trying to check if one of the files is available after installing the package:
system.file("extdata", "accident_2013.csv.bz2",
package = "FARS",
mustWork = TRUE)
I get an error message:
Error in system.file("extdata", "accident_2013.csv.bz2", package = "FARS", :
no file found
These data files need to be available with the package, so the examples provided in the vignette work properly.
Here's a "real-life" example, using a simple package I wrote recently.
I have a "data" directory in the build directory.
EDIT To clarify the comments found in R-exts, the directory tree packagename/inst/extdata is intended for data that your functions call directly, by specifying that directory path. Since you want to load data into your workspace, use the data directory.
My "data" directory contains one file named preciseNumbersAsChar.r . That file contains assignments such as
charE <- {long number string}
If you read the help page for the command data, it explains that files ending in .r are sourced when called.
library(FunWithNumbers)
data('preciseNumbersAsChar') #works
Which is to say, the defined objects are now in my environment.
It's worth reading the help page for data in detail as different file types are handled slightly differently.
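For the inst/extdata route mentioned in the EDIT above, here is a hedged sketch of how a function would locate a raw file at run time. The package and file names are the ones from the question; nothing here assumes the file is actually installed, since that is what the check tests.

```r
# Look up a raw file shipped under inst/extdata of an installed package.
# Without mustWork = TRUE, system.file() returns "" instead of raising an error.
path <- system.file("extdata", "accident_2013.csv.bz2", package = "FARS")

if (nzchar(path)) {
  # read.csv() reads bzip2-compressed files transparently
  accidents <- read.csv(path)
} else {
  message("accident_2013.csv.bz2 was not found in the installed FARS package")
}
```

Checking nzchar(path) first gives a clearer message than the "no file found" error and makes it easier to tell whether the file made it into the installed package at all.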

Global variable on load in R package

I'm writing a package in R and I'd like to include some sample objects in it that would be easily accessible for users to play with. The problem is they contain non-ASCII characters, and R CMD check won't allow this in .rda files in data. It will, however, allow Unicode in inst/extdata. I could just have these datasets read and wrapped in objects when the package is loaded. I tried assign and <<-, but I couldn't make either work.
Alternately, they could be loaded and saved as .rda files during the installation of the package. This would be preferable, in fact, but from what I read this seemed to be less possible.
Probably irrelevant but possibly interesting bit of history: I started the package on Debian unstable. I saved those datasets as .rda and they passed the check just fine. At one point I made a little correction, resaved them, and got a warning. I saved them again, and the warning disappeared. Then I moved to Debian stable, added some new datasets, resaved them all, and now I can't get rid of the warning in any way. When I save them from r-devel, however, I only get a note, not a warning.
The answer is embarrassingly simple: read the data and prepare the variables in one of the files in the R folder, and export them with a roxygen #' @export tag. No need to assign or anything.
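One way to read this, as a sketch rather than the poster's actual code: define the object in a file under R/ and export it. The file name, object name and contents below are made up, and writing the non-ASCII characters as Unicode escapes is a swapped-in detail that keeps the source file itself ASCII.

```r
# R/sample-data.R (hypothetical file name)
# The strings use Unicode escapes so this source file stays pure ASCII.

#' Sample words containing non-ASCII characters
#'
#' Hypothetical example object, defined in code instead of an .rda file.
#' @export
sample_words <- c("na\u00efve", "caf\u00e9", "\u017c\u00f3\u0142w")
```

After document() and reinstalling, users would get sample_words as soon as the package is attached, without calling data().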

Automatic loading of data from sysdata.rda in package

I have spent a lot of time searching for an answer to what is probably a very basic question, but I just can't find the solution to my issue. The closest that I found was this exchange from a few years ago.
In that case, the issue was the location of the sysdata.rda file in the correct directory within the package. That is not my issue.
I have some variables that store things like color palettes that I am using inside a package. These variables are only used inside my functions, so I am storing them in R/sysdata.rda. However, when I load the package, the variables are not loading into the package environment. If I load the data manually from sysdata.rda then everything works fine.
My impression from reading everything that I could find on internal data in R packages was that the data in R/sysdata.rda would load automatically.
Here is the code that I am using to store my data.
devtools::use_data(tmpBrks, tmpColors, prcpBrks, prcpChgBrks,
prcpChgBrkLabels, prcpColors, prcpChgColors,
internal = TRUE, overwrite = TRUE)
That successfully creates the data file at R/sysdata.rda and the data is in the file when I load it manually.
What do I need to do to have the data load automatically so the functions in my package can use them?
As usual, this was a bad combination of user ignorance and poor R documentation. The data was being loaded and was available to the functions. Where I went wrong was in assuming that the data would be visible in the package environment. That is not the case.
As far as I can tell, internal data in the R/sysdata.rda file is available to the functions within the package, but not visible in any way. After I created the internal data file I was looking for the data in the package environment. When I didn't see it I assumed that it wasn't loaded. When I kept pushing forward with my package development I finally realized that the data was loading silently and accessible to the functions in the package.
As evidenced by the two upvotes that my question got, I am not the only one who didn't understand the behavior of the R/sysdata.rda internal data. Hopefully this explanation will save someone else a bunch of time searching for an answer to this issue that doesn't really exist.
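The distinction can be checked directly. The sketch below uses plot.lm from stats only as a stand-in for an internal object, since it lives in the stats namespace but is not exported; where_is is a hypothetical helper. Objects from R/sysdata.rda are loaded into the package namespace in the same way, so the same check should work with your own package and object names.

```r
# Hypothetical helper: report whether an object exists in a package's
# namespace and whether it is among the package's exports.
where_is <- function(name, pkg) {
  ns <- asNamespace(pkg)
  c(
    in_namespace = exists(name, envir = ns, inherits = FALSE),
    exported     = name %in% getNamespaceExports(ns)
  )
}

res <- where_is("plot.lm", "stats")
print(res)
```

For interactive debugging, an internal object can also be inspected with the triple-colon operator, as in stats:::plot.lm.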
