Difference between loading and attaching in R

In RStudio, when I check and uncheck a package, I see the following commands.
library("ggplot2", lib.loc="~/R/win-library/3.4")
detach("package:ggplot2", unload=TRUE)
Can someone explain what unload=TRUE does?
Conceptually is there a difference between loading/unloading vs attaching/detaching?

From R's official help pages (see also R Packages - Namespaces):
Anything needed for the functioning of the namespace should be handled at load/unload times by the .onLoad and .onUnload hooks.
For example, DLLs can be loaded (unless done by a useDynLib directive in the ‘NAMESPACE’ file) and initialized in .onLoad and unloaded in .onUnload.
Use .onAttach only for actions that are needed only when the package becomes visible to the user (for example a start-up message) or need to be run after the package environment has been created.
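A minimal R/zzz.R sketch of the two hooks may make the quoted advice concrete (the package name mypkg and the option name are hypothetical):

```r
# R/zzz.R -- sketch only; "mypkg" and the option name are made up
.onLoad <- function(libname, pkgname) {
  # runs whenever the namespace is loaded,
  # e.g. on the first mypkg::fun() call
  if (is.null(getOption("mypkg.verbose"))) {
    options(mypkg.verbose = FALSE)
  }
}

.onAttach <- function(libname, pkgname) {
  # runs only when the package is attached via library()/require()
  packageStartupMessage("Welcome to mypkg")
}
```

Note the use of packageStartupMessage() rather than message(), so users can silence it with suppressPackageStartupMessages().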
 
attaching and .onAttach
thus mean that a package is attached to the user's workspace,
i.e. placed on the search path just below the global environment;
usually this is done via library(pkg),
and you can then use the plain fun() syntax
 
loading and .onLoad
thus mean that the package is (in some way) made available to the current R session
(e.g. by loading/attaching another package that depends on it, or by using the pkg::fun() syntax for the first time);
though you will not find its functions in the global environment,
you can use pkg::fun()

detach relates to the package environment on the search path (which mainly concerns the user),
while unload = TRUE relates to the namespace environment (which mainly concerns other packages).
After detach you can no longer call the package's functions directly,
though pkg::fun() still works; unloadNamespace won't prevent you from calling pkg::fun() either (the namespace is simply reloaded), but while unloaded, other packages cannot use its functions.
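The difference can be seen in a live session; a sketch using the base package tools, which is typically not loaded at startup:

```r
# tools:: triggers *loading* of the namespace, without attaching it
invisible(tools::file_ext("script.R"))
"tools" %in% loadedNamespaces()    # TRUE: the namespace is loaded
"package:tools" %in% search()      # FALSE: not on the search path

# library() loads AND attaches, so plain fun() works afterwards
library(tools)
"package:tools" %in% search()      # TRUE
file_ext("script.R")               # no tools:: prefix needed

# undoing it mirrors the two levels
detach("package:tools")            # drop from the search path
unloadNamespace("tools")           # unload the namespace as well
```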

Related

devtools::load_all() side effects: C++ class constructor calling base::system.file fails / returns an empty string

Context
I'm developing (for the first time) an R package using Rcpp which implements an interface to another program (maxima). This package defines a C++ class, whose constructor needs to retrieve the path to an initialization script that gets installed with the package (the in-package path is inst/extdata/maxima-init.mac). The path to this script is then used as a parameter to spawn a child process that runs the program.
In order to retrieve the path to the installed initialization script I'm calling the R function base::system.file from within the C++ class constructor definition:
...
Environment env("package:base");
Function f = env["system.file"];
fs::path p(Rcpp::as<std::string>(f("extdata", "maxima-init.mac", Named("package") = "rmaxima")));
std::string utilsDir = p.parent_path().string();
...
// spawn child process using the path in utilsDir
My R/zzz.R creates an object of that class when the packages gets attached:
loadModule("Maxima", TRUE)
.onAttach <- function(libname, pkgname) {
  "package:base" %in% search()
  maxima <<- new(RMaxima)
}
The Problem
I can install.packages(rmaxima) and library(rmaxima) just fine and the package works as expected.
I now want to increase my development efficiency by using devtools::load_all() to avoid having to R CMD build rmaxima, install.packages(rmaxima) and library(rmaxima) each time I want to test a change. However, when calling devtools::load_all() (or similarly devtools::test(), with the working directory set to the package root), my implementation freezes: the variable utilsDir is empty, so launching the process does not return (I guess it keeps waiting for a valid path). I eventually need to kill the process manually. The same thing happens without setting .onAttach().
Apparently devtools::load_all() does not reproduce R's default search path. What can I do? Is this the problem, or am I missing something else?
Update
I just came across the following note in the devtools::load_all() R documentation, which could be a tip in the right direction:
Shim files:
‘load_all’ also inserts shim functions into the imports environment of
the loaded package. It presently adds a replacement version of
‘system.file’ which returns different paths from ‘base::system.file’.
This is needed because installed and uninstalled package sources have
different directory structures. Note that this is not a perfect
replacement for base::system.file.
Also I realized that devtools::load_all() only installs my package temporarily, but somehow doesn't copy the files from my inst/:
rcst#Velveeta:~$ ls -1R /tmp/RtmpdnvOQg/devtools_install_ee1e82c780/rmaxima/
/tmp/RtmpdnvOQg/devtools_install_ee1e82c780/rmaxima/:
DESCRIPTION
libs
Meta
NAMESPACE
/tmp/RtmpdnvOQg/devtools_install_ee1e82c780/rmaxima/libs:
rmaxima.so
/tmp/RtmpdnvOQg/devtools_install_ee1e82c780/rmaxima/Meta:
features.rds
package.rds
As it turns out, devtools provides a solution to exactly this problem.
In short: calling system.file unqualified (i.e. resolving it from the search path, with the devtools package attached) solves the issue. Specifically, the modification:
// Environment env("package:base");
// Function f = env["system.file"];
Function f("system.file");
fs::path p(Rcpp::as<std::string>(f("extdata", "maxima-init.mac", Named("package") = "rmaxima")));
std::string utilsDir = p.parent_path().string();
Explanation
base::system.file(..., mustWork = FALSE) returns an empty string if no match is found. devtools::load_all() temporarily installs the packages inside /tmp/ (on my linux machine). The directory structure of the temporary installation differs from the one of the regular installation, i.e. the one created by install.packages(). In my case, most notably, devtools::load_all() does not copy the inst/ directory, which contains the initialization file.
Now, calling base::system.file("maxima-init.mac", package="rmaxima", mustWork=FALSE) naturally fails, since it searches inside the temporary installation. Having devtools attached masks system.file() with devtools::system.file(), which as mentioned above is "... meant to intercept calls to base::system.file()" and behaves differently from base::system.file(). Practically, I think this means that it will search the package's source directory instead of the temporary installation.
This way, simply calling system.file() from the global environment calls the right function, either from devtools or base, for either the development or user version of the package automatically.
Nonetheless, additionally using ccache (thanks @Dirk) substantially speeds up my development workflow.

R: Suppressing renv project startup message

Typically, when starting up an renv project, one gets a message that looks something like this:
* Project '~/path/to/project' loaded. [renv 0.10.0]
I am trying to suppress this message, particularly when non-interactively running a script from this project.
Checking the package help, I noted ?config, i.e. User-Level Configuration of renv. Specifically, I found synchronized.check, which the documentation says controls how renv lockfile synchronization is checked (this is also printed to the console). However, I couldn't find how to control the main startup message. I also checked ?settings but found nothing relevant either.
I've tried fiddling with options and Sys.setenv without luck so far.
So, is it possible to suppress the message, seeing that the renv script activate.R controls how the package itself is loaded?
You are correct that there isn't a specific documented way to configure this in renv. For now, you can set:
options(renv.verbose = FALSE)
before renv is loaded. (You may want to turn it back to TRUE if you want renv to display other messages as part of its normal work.)
You can suppress library startup messages with suppressPackageStartupMessages, e.g.
suppressPackageStartupMessages(library(igraph))
There is also suppressMessages for arbitrary function calls.
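One place to set the option is the project .Rprofile, before the activation script runs (a sketch; it assumes the standard layout that renv::init() creates):

```r
# .Rprofile (project root) -- set the option before renv loads
options(renv.verbose = FALSE)   # silence renv's startup banner
source("renv/activate.R")       # the line renv::init() added
options(renv.verbose = TRUE)    # optional: restore normal messaging
```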

Weird issue while testing R package: package `[package name]` found more than once, using the first from [file path]

I'm working on a somewhat complex package (that I unfortunately can't share) that involves a Shiny app, and an issue has surfaced where I'm getting these warnings when testing:
package [package name] found more than once, using the first from [file path]
In library(testthat) : package ‘testthat’ already present in search()
The first one occurs because I'm using system.file to pull a file in inst that I use for testing.
I've tried to do some debugging with .libPaths and by forcing system.file to go to the .libPaths default with the lib.loc argument, but that doesn't do anything.
I've tried uninstalling the package, which works because then there are not multiple results for find.package.
It seems like testthat adds the current directory to the library paths while it's running, and this creates the issue with system.file and consequently, find.package.
I'm really confused as to what's causing this. I'm combing through the changes I've made and I can't seem to find anything. Any ideas are helpful. I've tried googling this error message and all that comes up is the source.
The issue here was with changing the option setting for verbose. That caused more output to result from the code, which broke a lot of tests. I hope this helps someone in the future.

Find Library Location of Loaded Namespace

I'm looking to fetch what would be the lib.loc argument to library or loadNamespace, from a currently loaded namespace.
For attached packages this is relatively straightforward:
path.package("stats") # get library location of loaded stats package
However, for a non-attached loaded namespace, the best I can come up with is:
getNamespace(x)[[".__NAMESPACE__"]][["path"]]
which happens to work on my R version, but has absolutely no guarantee of working in the future. I could also temporarily attach the package to use path.package, but that would potentially trigger attach hooks and I'd prefer to avoid that.
Anyone know of an equivalent to path.package for loaded but not attached namespaces?
You can use find.package; from its documentation:
It returns the path to the locations where the given packages are found. If lib.loc is NULL, then loaded namespaces are searched before the libraries.
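A quick sketch using the recommended package splines, loaded but not attached (getNamespaceInfo is a base R accessor that reads the same field as the .__NAMESPACE__ hack above):

```r
loadNamespace("splines")          # load the namespace without attaching
"package:splines" %in% search()   # FALSE: not attached

# find.package() with lib.loc = NULL checks loaded namespaces first
find.package("splines")

# base R's accessor for the namespace's own recorded path
getNamespaceInfo("splines", "path")
```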

How to use the parallel package inside another package, using devtools?

When running the following code in an R terminal:
library(parallel)
func <- function(a,b,c) a+b+c
testfun <- function() {
cl <- makeCluster(detectCores(), outfile="parlog.txt")
res <- clusterMap(cl, func, 1:10, 11:20, MoreArgs = list(c=1))
print(res)
stopCluster(cl)
}
testfun()
... it works just fine. However, when I copy the two function definitions into my own package, add a line #' @import parallel, run devtools::load_all("mypackage") in the R terminal and then call testfun(), I get an
Error in unserialize(node$con) (from myfile.r#7) :
error reading from connection
where #7 is the line containing the call to clusterMap.
So the exact same code works on the terminal but not inside a package.
If I take a look into parlog.txt, I see the following:
starting worker pid=7204 on localhost:11725 at 13:17:50.784
starting worker pid=4416 on localhost:11725 at 13:17:51.820
starting worker pid=10540 on localhost:11725 at 13:17:52.836
starting worker pid=9028 on localhost:11725 at 13:17:53.849
Error: (converted from warning) namespace 'mypackage' is not available and has been replaced
by .GlobalEnv when processing object ''
Error: (converted from warning) namespace 'mypackage' is not available and has been replaced
by .GlobalEnv when processing object ''
Error: (converted from warning) namespace 'mypackage' is not available and has been replaced
by .GlobalEnv when processing object ''
Error: (converted from warning) namespace 'mypackage' is not available and has been replaced
by .GlobalEnv when processing object ''
What's the root of this problem and how do I resolve it?
Note that I'm doing this with a completely fresh, naked package. (Created by devtools::create.) So no interactions with existing, possibly destructive code.
While writing the question, I actually found the answer and am going to share it here.
The problem here is the combination of the packages devtools and parallel.
Apparently, for some reason, parallel requires the package mypackage to be installed in some local library, even if you do not load it on the workers explicitly (e.g. using clusterEvalQ(cl, library(mypackage)) or something similar)!
I was employing the usual devtools workflow, meaning that I was working in dev_mode() all the time. However, this led to my package being installed only in some special dev-mode folders (I do not know exactly how this works internally). These are not searched by the worker processes spawned by parallel, since they are not in dev_mode.
So here is my 'workaround':
## turn off dev mode
dev_mode()
## install the package into a 'real' library
install("mypackage")
library(mypackage)
## ... and now the following works:
mypackage:::testfun()
As Hadley just pointed out correctly, another workaround would be to add a line
clusterEvalQ(cl, dev_mode())
right after cluster creation. That way, one can use the dev_mode.
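For reference, here is a self-contained sketch of the same pattern that runs without any package involved (so it works in and out of dev_mode); the worker-setup lines for a package under development are shown as comments:

```r
library(parallel)

add3 <- function(a, b, c) a + b + c

run_cluster <- function() {
  cl <- makeCluster(2)                  # small local socket cluster
  on.exit(stopCluster(cl), add = TRUE)  # always clean up the workers
  # If add3 lived in a package, the workers would have to find that
  # package in a library visible to them, e.g.:
  #   clusterEvalQ(cl, library(mypackage))
  # or, when developing with devtools:
  #   clusterEvalQ(cl, devtools::dev_mode())
  res <- clusterMap(cl, add3, 1:4, 5:8, MoreArgs = list(c = 1))
  unlist(res)
}
run_cluster()                           # 7 9 11 13
```

Functions defined in the global environment (like add3 here) are serialized and shipped to the workers automatically, which is why this version needs no clusterExport or library call on the workers.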
