Automatic loading of data from sysdata.rda in package - r

I have spent a lot of time searching for an answer to what is probably a very basic question, but I just can't find the solution to my issue. The closest that I found was this exchange from a few years ago.
In that case, the issue was the location of the sysdata.rda file in the correct directory within the package. That is not my issue.
I have some variables that store things like color palettes that I am using inside a package. These variables are only used inside my functions, so I am storing them in R/sysdata.rda. However, when I load the package, the variables are not loading into the package environment. If I load the data manually from sysdata.rda then everything works fine.
My impression from reading everything that I could find on internal data in R packages was that the data in R/sysdata.rda would load automatically.
Here is the code that I am using to store my data.
devtools::use_data(tmpBrks, tmpColors, prcpBrks, prcpChgBrks,
prcpChgBrkLabels, prcpColors, prcpChgColors,
internal = TRUE, overwrite = TRUE)
That successfully creates the data file at R/sysdata.rda and the data is in the file when I load it manually.
What do I need to do to have the data load automatically so the functions in my package can use them?

As usual, this was a bad combination of user ignorance and poor R documentation. The data was being loaded and was available to the functions. Where I went wrong was in assuming that the data would be visible in the package environment. That is not the case.
As far as I can tell, internal data in the R/sysdata.rda file is available to the functions within the package, but it is not visible in the package environment. After I created the internal data file I was looking for the data in the package environment. When I didn't see it I assumed that it wasn't loaded. When I kept pushing forward with my package development I finally realized that the data was loading silently and was accessible to the functions in the package.
As evidenced by the two up votes that my question got, I am not the only one who didn't understand the behavior of the R/sysdata.rda internal data. Hopefully this explanation will save someone else a bunch of time searching for an answer to an issue that doesn't really exist.
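A minimal sketch of the distinction described above, using a base package as a stand-in. Objects from R/sysdata.rda are loaded into the package namespace but, like any unexported object, they do not appear in the attached package environment. Here print.lm from stats plays the role of an internal object; mypkg in the final comments is a hypothetical package name, and tmpColors is one of the objects from the question.

```r
# print.lm is defined in the stats namespace but is not exported, so it
# behaves like an object stored in R/sysdata.rda.
in_attached  <- exists("print.lm", envir = as.environment("package:stats"),
                       inherits = FALSE)
in_namespace <- exists("print.lm", envir = asNamespace("stats"),
                       inherits = FALSE)

in_attached   # FALSE: not visible in the attached package environment
in_namespace  # TRUE: available to code running inside the namespace

# For your own package the equivalent checks would be (hypothetical names):
# exists("tmpColors", envir = asNamespace("mypkg"), inherits = FALSE)
# mypkg:::tmpColors
```

Functions in a package are evaluated with the namespace as their enclosing environment, which would explain why they can use the data even though it is not listed in the package environment.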

Related

Global variable on load in R package

I'm writing a package in R and I'd like to include some sample objects in it that would be easily accessible for users to play with. The problem is they contain non-ASCII characters, and R CMD check won't allow this in .rda files in data. It will, however, allow Unicode in inst/extdata. I could just have these datasets read and wrapped in objects when the package is loaded. I tried assign and <<-, but I couldn't make either work.
Alternately, they could be loaded and saved as .rda files during the installation of the package. This would be preferable, in fact, but from what I read this seemed to be less possible.
Probably irrelevant but possibly interesting bit of history: I started the package on Debian unstable. I saved those datasets as .rda and they passed the check just fine. At one point I made a little correction, resaved them, and got a warning. I saved them again, and the warning disappeared. Then I moved to Debian stable, added some new datasets, resaved them all, and now I can't get rid of the warning in any way. When I save them from r-devel, however, I only get a note, not a warning.
The answer is embarrassingly simple: read the data and prepare the variables in one of the files in the R folder, and #' @export them. No need to assign or anything.

R Package Build/Install Error: "object not found" even though I have it in R/sysdata.rda

Similar Question
accessing sysdata.rda within package functions
Why This Similar Question Does Not Apply To Me
They were able to actually build it, and apparently it was a Github error for them (not related)
R VERSION
3.4.2 (I tried using 3.4.3 as well but the same problem occurred)
EDIT: I am on Windows 10
Context
I have fully read the following tutorial on R packages and how to include .Rda files in them. I have LazyData in my DESCRIPTION file set as true as well. I have tried both the data/ folder implementation and the R/sysdata.rda implementation using the function devtools::use_data() with the respective options of internal = FALSE and internal = TRUE.
However, when I try to build the package, or use devtools::install (which builds as well I assume), it fails and gives me the following error message:
Error in predict(finalModel, newInput) : object 'finalModel' not found
Where finalModel is stored within my .rda file.
Does anyone know any possible reasons why this might occur?
I also asked a coworker to install the package on his machine, but unfortunately he got the exact same error.
I made another test package as a 'sanity-check' by creating a simple linear model using the lm() function on datasets::swiss, and then made a test package with this newly created model as a .rda file. When I referenced this test model in a function within this test package, it eerily worked, despite the fact that (to the best of my knowledge) I used the exact same steps to create this new R package.
Also, I unfortunately cannot share the code for the package I am creating, but I am willing to share the code for the test package that uses the swiss dataset.
Thank you in advance.
EDIT: My .rda file I am putting in the package was created last year, if that has anything to do with it.
I just solved a similar issue of having object 'objectName' not found that arose during package management. In my case, the issue related to losing the context of variables when working with parallelization.
When using parallel::clusterExport(cl, varlist=c("function-name")), clusterExport looks at .GlobalEnv for variable definitions. This wouldn't come up during debugging, as I always had the variables defined in .GlobalEnv. The solution was to state the environment explicitly: parallel::clusterExport(cl, varlist=c("function-name"), envir=environment()). This ensures that the parallel processes have context of the variables within the data/ folder and R/sysdata.rda.
Source
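A small sketch of the clusterExport fix described above. The function run_on_workers and the variable scale_value are made up for illustration; the point is only that the variable exists in the calling function's environment rather than in .GlobalEnv.

```r
library(parallel)

run_on_workers <- function() {
  scale_value <- 10                      # exists only inside this function
  cl <- makeCluster(2)
  on.exit(stopCluster(cl), add = TRUE)

  # With the default envir = .GlobalEnv this call would fail, because
  # scale_value is not defined there. environment() is the current
  # function's environment, from which lookup continues outward.
  clusterExport(cl, varlist = "scale_value", envir = environment())

  # Evaluated in each worker's global environment, so it needs the export.
  clusterEvalQ(cl, scale_value * 2)
}

result <- run_on_workers()
unlist(result)  # 20 20
```

Inside a package function, the lookup that starts at environment() should continue into the package namespace, which is presumably why the same argument also reaches objects from R/sysdata.rda.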
If you have more than one internal file, you must save them together:
usethis::use_data(file_1,
file_2,
file_3,
internal = TRUE,
overwrite = TRUE)
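The reason for saving everything in one call can be sketched with base save(), assuming (as the file name suggests) that internal data always ends up in the single file R/sysdata.rda. A temporary file stands in for that path here, and file_1 and file_2 are placeholder objects.

```r
rda_path <- tempfile(fileext = ".rda")  # stands in for R/sysdata.rda
file_1 <- 1:3
file_2 <- letters[1:3]

# Two separate saves to the same path: the second replaces the first.
save(file_1, file = rda_path)
save(file_2, file = rda_path)

contents <- new.env()
load(rda_path, envir = contents)
after_separate_saves <- ls(contents)    # "file_2" only

# One save with all objects keeps both.
save(file_1, file_2, file = rda_path)
contents <- new.env()
load(rda_path, envir = contents)
after_joint_save <- sort(ls(contents))  # "file_1" "file_2"
```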

R function example requires nonstandard dataset, doesn't jive with devtools

I've been struggling to get the example code for a function working using devtools::check(), because the data required for the example is not in .RData format. Unfortunately, the way the function is written, .RData cannot be loaded and work properly. The function takes in a list of filenames and performs an action on them collectively.
Therefore, example code must be written in a way that check() is able to access a folder and list the files therein. Using the function on my own computer, I input
setwd("/Users/mydirectory")
myfilelist <- list.files(pattern = "mypattern")
output <- myfunction(myfilelist, ...)
and everything is groovy. But this doesn't work with devtools because @examples doesn't know how to access subdirectories on my computer. check() pulls the following error:
base::assign(".ptime", proc.time(), pos = "CheckExEnv")
This is almost undoubtedly because check() doesn't know where to look for the data. I'd like it to look toward github to access the online data repository.
I found this brief conversation regarding a similar roxygen-related problem, but overall I haven't seen much advice on how to work through it. I think that perhaps this issue starts to get a little closer to my situation, but here the user failed to export a function, rather than bind data to an example.
I don't think I'm looking for a pull function (though the end goal is to pull data...), does anyone have advice moving forward? I have the data stored in the inst/extdata folder on github, so while I don't really have something reproducible for you all I'm hoping you might have some thoughts.
Edit: I worked around the problem using @alistaire's advice below, pointing roxygen to the package directory (updated on github) and also using \dontrun{}. However, I am leaving the question unanswered for now because I think accessing data stored in github should still be somehow possible and we haven't yet addressed that.
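A sketch of what pointing an example at the package directory can look like with system.file(), which resolves paths inside an installed package instead of relying on setwd(). The runnable part uses the stats package and its DESCRIPTION file purely as stand-ins; mypkg, mypattern and myfunction in the comments are the hypothetical names from the question.

```r
# Stand-in that runs anywhere: locate a file inside an installed package.
data_dir <- system.file(package = "stats")
myfilelist <- list.files(data_dir, pattern = "^DESCRIPTION$",
                         full.names = TRUE)

# With a hypothetical package "mypkg" shipping files in inst/extdata,
# the @examples code would look like this instead:
# data_dir <- system.file("extdata", package = "mypkg")
# myfilelist <- list.files(data_dir, pattern = "mypattern",
#                          full.names = TRUE)
# output <- myfunction(myfilelist)
```

Using full.names = TRUE returns complete paths, so the function can open the files without changing the working directory.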

Temporarily loading and unloading packages in an R function

I am writing a function that will take the name of an installed package and return a data frame listing all the data frames available in that package along with the number and types of variables in those data frames.
In order to do this, I need to require the package temporarily so I can access its data sets. The problem I have is that requiring a package also introduces a whole lot of extra stuff into the search path and the loaded namespaces beyond just the package in question. I want my function to tidy up after itself, but I can't find a good way to detach everything that got imported when the package was required. In particular, detach seems to detach only the package, but not any of the other imported stuff.
Any advice?
I'm not sure what IDE you are working with but many of them have "tab-completion". If I type :....... ?unload at my console and hit <tab> I immediately see ??unloadNamespace ... so that would be a reasonable function to investigate. You should first look at:
?unloadNamespace
... and then decide if that is sufficient. There is also the detach function that has a link to its associated help page in that help page.
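One way to use unloadNamespace for the tidy-up, sketched under some assumptions: the helper name with_temporary_namespace is made up, the package is loaded with loadNamespace() rather than attached (so the search path is not touched), and splines is just an example of a package that is installed with R but usually not loaded.

```r
with_temporary_namespace <- function(pkg, action) {
  before <- loadedNamespaces()
  on.exit({
    # Unload every namespace that was not loaded before. A namespace that
    # another one still imports refuses to unload, so keep retrying until
    # nothing new is left or no further progress is made.
    repeat {
      added <- setdiff(loadedNamespaces(), before)
      if (length(added) == 0) break
      unloaded <- vapply(added, function(ns) {
        !inherits(try(unloadNamespace(ns), silent = TRUE), "try-error")
      }, logical(1))
      if (!any(unloaded)) break
    }
  }, add = TRUE)

  loadNamespace(pkg)
  action(pkg)
}

n_exports <- with_temporary_namespace("splines", function(pkg) {
  length(getNamespaceExports(pkg))
})
"splines" %in% loadedNamespaces()
```

Unloading is not guaranteed to be clean for every package (compiled code and registered methods can linger), so this is a best-effort tidy-up rather than a full reset.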

Can I load an RData file while bypassing loading the namespaces?

Let's say some of my users cannot alter their R environments, but I need them to be able to open up RData files. These environment files require a package to be loaded (httpuv to be exact). We don't care about the package, we don't need its capabilities, we just need to get at the data. Is there a way to either force R to bypass loading namespaces when loading the RData file, or force it to save it without namespace dependencies at the originating end? Thanks.
To reproduce, install Shiny. Create and save some R objects to the server's file system from within a Shiny applet as an RData file. Copy the file over to a computer that doesn't have Shiny or the httpuv package installed. Try loading the RData file, even if the actual objects you saved are completely ordinary data.frames that have nothing to do with Shiny or httpuv.
I did strings on the RData, and the damn thing is full of references to httpuv. The software is loading the file and then actively deciding to not continue in the internal loadFromConn2() function. Therefore there must be a way to make it stop doing so.
Really @baptiste should get credit for the link in his comment to some general solutions, especially the R CMD INSTALL --fake trick, and I will accept that if he reposts it as an answer. That is why I am not accepting the following answer of my own to the specific problem that caused it in my case, but I am posting my answer in case it helps someone else.
Some of the objects I was saving were lm fitted objects. Those contain formula/terms objects (at least two each, for some reason... maybe because they've been through stepAIC), and those formulas in turn each have an environment attribute. The environment attribute is .GlobalEnv which probably does contain copies of package functions someplace. When I dug through the objects inside the fitted models, and then the objects inside all the attributes of those objects, and then the objects inside the attributes of the attributes of those objects... and set every environment attribute I could find to NULL, eventually I was able to save that fitted model to a file that could be opened from a different R installation without getting the error about not being able to load a namespace.
I suppose I could also write a function to iterate through the objects within a fitted model, and their attributes, and remove environments but that sounds ugly and dangerous. Maybe there is a way to force formulas and fitted models not to retain environments, and that will be better. For the time being, instead of saving fitted models, I will save their call attributes after scrubbing any environment attributes I might find there. If that doesn't work, I'll deparse them into character strings.
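A minimal sketch of where a fitted lm object keeps environments and how they can be replaced. It uses datasets::swiss and a local() block as stand-ins for the real models, and it swaps in globalenv() rather than NULL (an assumption made here so that predict() keeps working; the answer above used NULL). An lm fit carries the environment in two places, the terms component and the terms attribute of the stored model frame, which may be the "at least two each" noted above.

```r
fit <- local({
  big_object <- rnorm(1e5)  # stands in for whatever else the environment holds
  lm(Fertility ~ Education, data = datasets::swiss)
})

# The formula's environment travels with the fit, including big_object.
captured <- ls(attr(fit$terms, ".Environment"))
size_before <- length(serialize(fit, NULL))

# Point both copies of the terms at the global environment, which is
# serialized as a reference rather than by value.
attr(fit$terms, ".Environment") <- globalenv()
attr(attr(fit$model, "terms"), ".Environment") <- globalenv()

size_after <- length(serialize(fit, NULL))
size_after < size_before  # TRUE: the captured environment is no longer saved

# The stripped fit is still usable for prediction.
prediction <- predict(fit, newdata = data.frame(Education = 10))
```

Other model classes may store environments in additional places, so checking the serialized size before and after is a reasonable sanity test.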
PS: I used the RDS format and haven't yet tested it with RData, but I suspect that the problem was the saving of the evaluation environment in some of the attributes, and had nothing to do with the format in which the objects get saved. I'll post an update if it turns out that this doesn't also work with RData.
PPS: I suspect I'm not the only one here who's hearing about the R CMD INSTALL --fake trick for the first time, and perhaps the word should be spread about this... because to the extent other R users don't know about it, this remains an obvious vector for denial-of-service attacks against R!
I will accept my own answer to get rid of the SO auto-nagger, but will unaccept it and accept @baptiste's if they make it possible for me to do so by posting it as an answer. Thanks.
