Attaching a temporary namespace to the search path - r

This question is sort of a follow-up to this post, as I'm still not fully convinced that, with respect to code robustness, it wouldn't be far better to make typing namespace::foo() a habit instead of just typing foo() and praying you get the desired result ;-)
Actual question
I'm aware that this goes heavily against "standard R conventions", but let's just say I'm curious ;-) Is it possible to attach a temporary namespace to the search path somehow?
Motivation
At a point where my package mypkg is still in "devel stage" (i.e. not a true R package yet):
I'd like to source my functions into an environment mypkg instead of .GlobalEnv
then attach mypkg to the search path (as a true namespace if possible)
in order to be able to call mypkg::foo()
I'm perfectly aware that calling :: has its downsides (it takes longer than simply typing a function's name and letting R handle the lookup implicitly) and/or might not be considered necessary due to the way a) R scans through the search path and b) packages may import their dependencies (i.e. using "Imports" instead of "Depends", not exporting certain functions etc). But I've seen my code crash at least twice due to the fact that some package has overwritten certain (base) functions, so I went from "blind trust" to "better-to-be-safe-than-sorry" mode ;-)
What I tried
AFAIU, namespaces are in principle nothing more than some special kind of environment
> search()
[1] ".GlobalEnv" "package:stats" "package:graphics"
[4] "package:grDevices" "package:utils" "package:datasets"
[7] "package:methods" "Autoloads" "package:base"
> asNamespace("base")
<environment: namespace:base>
And there's the attach() function that attaches objects to the search path. So here's what I thought:
temp.namespace <- new.env(parent=emptyenv())
attach(temp.namespace)
> asNamespace("temp.namespace")
Error in loadNamespace(name) :
there is no package called 'temp.namespace'
I guess I somehow have to work with attachNamespace() and figure out what it expects before it is called in library(). Any ideas?
EDIT
With respect to Hadley's comment: I actually wouldn't care whether the attached environment is a full-grown namespace or just an ordinary environment as long as I could extend :: while keeping the "syntactic sugar" feature (i.e. being able to call pkg::foo() instead of "::"(pkg="pkg", name="foo")()).
This is what the function "::" looks like:
> get("::")
function (pkg, name)
{
pkg <- as.character(substitute(pkg))
name <- as.character(substitute(name))
getExportedValue(pkg, name)
}
This is what it should also be able to do in case R detects that pkg is in fact not a namespace but just some environment attached to the search path:
"::*" <- function (pkg, name)
{
pkg <- as.character(substitute(pkg))
name <- as.character(substitute(name))
paths <- search()
if (!pkg %in% paths) stop(paste("Invalid namespace environment:", pkg))
pos <- which(paths == pkg)
if (length(pos) > 1) stop(paste("Multiple attached envirs:", pkg))
get(x=name, pos=pos)
}
It works, but there's no syntactic sugar:
> "::*"(pkg="tempspace", name="foo")
function(x, y) x + y
> "::*"(pkg="tempspace", name="foo")(x=1, y=2)
[1] 3
How would I be able to call pkg::*foo(x=1, y=2) (disregarding the fact that ::* is a really bad name for a function ;-))?

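There is something wrong in your motivation: your namespace does NOT have to be attached to the search path in order to use the '::' notation. It is actually the opposite: '::' looks the name up directly among the exports of the named namespace, loading it if necessary, and does not consult the search path at all.
The search path is only used to resolve unqualified symbols, by looking through the attached environments in order.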
So, as Hadley told you, you just have to use devtools::load_all(), that's all...
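To illustrate the point, here is a minimal sketch. It uses the tools package only because it ships with R and is normally not attached in a fresh session; that choice is an assumption of this example, not something from the original question.

```r
# In a fresh session 'tools' is usually not on the search path.
print("package:tools" %in% search())

# '::' resolves the name among the exports of the tools namespace,
# loading the namespace if necessary, without attaching anything.
ext <- tools::file_ext("report.txt")
print(ext)

# The namespace is loaded now, even though nothing was attached by this code.
print("tools" %in% loadedNamespaces())
```

The same mechanism is what makes mypkg::foo() work once the package has been loaded with devtools::load_all().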

Related

Why/how some packages define their functions in nameless environment?

In my code, I needed to check which package a function comes from (in my case it was exprs(): I needed it from Biobase but it turned out to be overridden by rlang).
From this SO question, I thought I could use simply environmentName(environment(functionname)). But for exprs from Biobase that expression returned empty string:
environmentName(environment(exprs))
# [1] ""
After checking the structure of environment(exprs) I noticed that it has .Generic member which contains package name as an attribute:
environment(exprs)$.Generic
# [1] "exprs"
# attr(,"package")
# [1] "Biobase"
So, for now I made this helper function:
pkgparent <- function(functionObj) {
functionEnv <- environment(functionObj)
envName <- environmentName(functionEnv)
if (envName!="")
return(envName) else
return(attr(functionEnv$.Generic,'package'))
}
It does the job and correctly returns package name for the function if it is loaded, for example:
pkgparent(exprs)
# Error in environment(functionObj) : object 'exprs' not found
library(Biobase)
pkgparent(exprs)
# [1] "Biobase"
library(rlang)
# The following object is masked from ‘package:Biobase’:
# exprs
pkgparent(exprs)
# [1] "rlang"
But I would still like to learn why some packages' functions are defined in an "unnamed" environment while others show up as <environment: namespace:packagename>.
What you’re seeing here is part of how S4 method dispatch works. In fact, .Generic is part of the R method dispatch mechanism.
The rlang package is a red herring, by the way: the issue presents itself purely due to Biobase’s use of S4.
But more generally your resolution strategy might fail in other situations, because there are other reasons (albeit rarely) why packages might define functions inside a separate environment. The reason for this is generally to define a closure over some variable.
For example, it’s generally impossible to modify variables defined inside a package at the namespace level, because the namespace gets locked when loaded. There are multiple ways to work around this. A simple way, if a package needs a stateful function, is to define this function inside an environment. For example, you could define a counter function that increases its count on each invocation as follows:
counter = local({
current = 0L
function () {
current <<- current + 1L
current
}
})
local defines an environment in which the function is wrapped.
To cope with this kind of situation, what you should do instead is to iterate over parent environments until you find a namespace environment. But there’s a simpler solution, because R already provides a function to find a namespace environment for a given environment (by performing said iteration):
pkgparent = function (fun) {
nsenv = topenv(environment(fun))
environmentName(nsenv)
}
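As a rough check of this approach, the sketch below repeats the helper and applies it to a function whose enclosure is an anonymous environment. The function nested_fun is a hypothetical stand-in for a package closure; it is built by hand under the stats namespace so the example runs without extra packages.

```r
pkgparent <- function(fun) {
  environmentName(topenv(environment(fun)))
}

# A function exported by a package: its enclosure is the namespace itself.
print(pkgparent(stats::sd))

# Hypothetical stand-in for a package closure: a function whose enclosure
# is an anonymous environment sitting directly below the stats namespace.
nested_fun <- function() NULL
environment(nested_fun) <- new.env(parent = asNamespace("stats"))

# The anonymous environment itself has no name ...
print(environmentName(environment(nested_fun)))

# ... but topenv() climbs up to the enclosing namespace.
print(pkgparent(nested_fun))
```

Note that this sketch only covers closures; primitives such as sum have no environment, so they would need separate handling.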

Change search() locations without unloading packages

I am using code which depends on two packages that conflict. I would like to give one priority for only a short period of time and my plan is to just move it up to the front of search(). However, I can't just unload and reload. I tried that and it causes other problems, and running library on an already loaded package does not work.
Here is an example (the real use case involves non-CRAN packages):
library(ggplot2)
library(MASS)
> search()
[1] ".GlobalEnv" "package:MASS" "package:ggplot2"
[4] "package:stats" "package:graphics" "package:grDevices"
[7] "package:utils" "package:datasets" "package:methods"
[10] "Autoloads" "package:base"
How can I now move package:ggplot2 ahead of package:MASS without detaching/unloading ggplot2?
EDIT
Inside the function I need to call, say function1, there is an expression that makes further calls. I cannot edit those calls to append ::.
e.g.
unchangeable <- function1("abc") ~ function2("def")
Suppose mainFun is the one I want to call. I can do
mainFun(unchangeable)
but I cannot specify
mainFun::unchangeable
It is indeed possible to edit unchangeable by manipulating formula objects. But that is not ideal, and I need a more general solution that also works for objects of other types.
EDIT2:
Here is an example, which shows a similar problem.
library(mgcv)
library(gam)
y <- rnorm(100)
x <- rnorm(100)
thisformula <- y ~ s(x)
gamgam <- gam(thisformula)
# s <- mgcv::s
mgcvgam <- mgcv::gam(thisformula)
This gives me the error
Error: $ operator is invalid for atomic vectors
Uncommenting the line s <- mgcv::s solves the problem in this case. But in my more general case it doesn't, and in any case it seems like a hack. How can I have all functions that are called within mgcv::gam first be looked up in mgcv?
You can refer to the function in a specific package using ::. For example, ggplot2::labs will always refer to the function exported by ggplot2, even if it is masked by a package loaded later.
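A minimal sketch of that behaviour, using only packages that ship with R: the global sd defined here is a hypothetical mask, standing in for a conflicting package.

```r
# A user-defined function that masks stats::sd for unqualified calls.
sd <- function(x) "masked"

masked_result <- sd(c(1, 2, 3))            # finds the masking definition first
qualified_result <- stats::sd(c(1, 2, 3))  # always the version exported by stats

print(masked_result)
print(qualified_result)

rm(sd)  # remove the mask again
```

The qualification applies only to the call that carries it; it does not change how other unqualified names are looked up.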

See if a variable/function exists in a package?

I'm trying to test whether a particular variable or function exists in a package. For example, suppose I wanted to test whether a function called plot existed in package 'graphics'.
The following tests whether a function plot exists, but not what package it comes from:
exists('plot', mode='function')
Or I can test that something called plot exists in the graphics package, but this doesn't tell me whether it's a function:
'plot' %in% ls('package:graphics')
Is there a nice way to ask "does an object called X exist in package Y of mode Z"? (Essentially, can I restrict exists to a particular package?)
(Yes, I can combine the above two lines to first test that plot is in graphics and then ask for the mode of plot, but what if I had my own function plot masking graphics::plot? Could I then trust the output of exists('plot', mode='function')? )
Background: writing tests for a package of mine and want to test that various functions are exported. I'm using package testthat which executes tests in an environment where I can see all the internal functions of the package, and have been stung by exists('myfunction', mode='function') returning true, but I've actually forgotten to export myfunction. I want to test that various functions are exported.
Oh, I found it.
I noticed that in ?ls it says that the first argument ('package:graphics' for me) also counts as an environment. exists' where argument has the same documentation as ls' name argument, so I guessed 'package:graphics' might work there too:
exists('plot', where='package:graphics', mode='function')
[1] TRUE # huzzah!
environment(plot)
<environment: namespace:graphics>
find('+')
#[1] "package:base"
find('plot')
#[1] "package:graphics"
find('plot', mode="function")
#[1] "package:graphics"
All the answers proposed previously walk environments, not namespaces. Their behavior will therefore differ depending on whether the target package was loaded with library(), and in what order.
Here is an approach with utils::getFromNamespace():
function_exists <- function(package, funcname) {
tryCatch({
utils::getFromNamespace(funcname, package)
TRUE
}, error = function(...) { FALSE })
}
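Note that getFromNamespace() also returns unexported objects and objects that are not functions. If the goal is specifically to test that a function is exported, a variant along the following lines may be closer to what is needed. The helper name is_exported_function is made up for this sketch.

```r
# Hypothetical helper: TRUE only if `funcname` is exported by `package`
# and the exported object is a function.
is_exported_function <- function(package, funcname) {
  funcname %in% getNamespaceExports(package) &&
    is.function(getExportedValue(package, funcname))
}

print(is_exported_function("stats", "sd"))                # exported function
print(is_exported_function("stats", "reorder.default"))   # internal S3 method
print(is_exported_function("stats", "no_such_function"))  # does not exist
```

Because it queries the namespace rather than the search path, the result does not depend on whether or in what order packages were attached.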

How can a non-imported method in a not-attached package be found by calls to functions not having it in their namespace?

An R namespace acts as the immediate environment for all functions in its associated package. In other words, when function bar() from package foo calls another function, the R evaluator first searches for the other function in <environment: namespace:foo>, then in "imports:foo", <environment: namespace:base>, <environment: R_GlobalEnv>, and so on down the search list returned by typing search().
One nice aspect of namespaces is that they can make packages act like better citizens: unexported functions in <environment: namespace:foo> and functions in imports:foo are available only: (a) to functions in foo; (b) to other packages that import from foo; or (c) via fully qualified function calls like foo:::bar().
Or so I thought until recently...
The behavior
This recent SO question highlighted a case in which a function well-hidden in its package's namespace was nonetheless found by a call to a seemingly unrelated function:
group <- c("C","F","D","B","A","E")
num <- c(12,11,7,7,2,1)
data <- data.frame(group,num)
## Evaluated **before** attaching 'gmodels' package
T1 <- transform(data, group = reorder(group,-num))
## Evaluated **after** attaching 'gmodels'
library(gmodels)
T2 <- transform(data, group = reorder(group,-num))
identical(T1, T2)
# [1] FALSE
Its immediate cause
@Andrie answered the original question by pointing out that gmodels imports from the package gdata, which includes a function reorder.factor that gets dispatched to inside the second call to transform(). T1 differs from T2 because the first is calculated by stats:::reorder.default() and the second by gdata:::reorder.factor().
My question
How is it that in the above call to transform(data, group=reorder(...)), the dispatching mechanism for reorder finds and then dispatches to gdata:::reorder.factor()?
(An answer should include an explanation of the scoping rules that lead from a call involving functions in the stats and base packages to a seemingly well-hidden method in gdata.)
Further possibly helpful details
Neither gdata:::reorder.factor, nor the gdata package as a whole are explicitly imported by gmodels. Here are the import* directives in gmodels' NAMESPACE file:
importFrom(MASS, ginv)
importFrom(gdata, frameApply)
importFrom(gdata, nobs)
There are no methods for reorder() or transform() in <environment: namespace:gmodels>, nor in "imports:gmodels":
ls(getNamespace("gmodels"))
ls(parent.env(getNamespace("gmodels")))
Detaching gmodels does not revert reorder()'s behavior: gdata:::reorder.factor() still gets dispatched:
detach("package:gmodels")
T3 <- transform(data, group=reorder(group,-num))
identical(T3, T2)
# [1] TRUE
reorder.factor() is not stored in the list of S3 methods in the base environment:
grep("reorder", ls(.__S3MethodsTable__.))
# integer(0)
R chat threads from the last couple of days include some additional ideas. Thanks to Andrie, Brian Diggs, and Gavin Simpson who (with others) should feel free to edit or add possibly important details to this question.
I'm not sure if I correctly understand your question, but the main point is that group is a character vector while data$group is a factor.
After attaching gmodels, calling reorder() on a factor dispatches to gdata:::reorder.factor,
so reorder(factor(group)) calls it.
In transform, the expressions are evaluated within the data frame given as the first argument, so in T2 <- transform(data, group = reorder(group,-num)), group is a factor.
UPDATED
library() loads the namespaces of the packages that gmodels imports from, without attaching them to the search path:
> loadedNamespaces()
[1] "RCurl" "base" "datasets" "devtools" "grDevices" "graphics" "methods"
[8] "stats" "tools" "utils"
> library(gmodels) # here, namespace:gdata is loaded
> loadedNamespaces()
[1] "MASS" "RCurl" "base" "datasets" "devtools" "gdata" "gmodels"
[8] "grDevices" "graphics" "gtools" "methods" "stats" "tools" "utils"
Just in case, the reorder generic exists in namespace:stats:
> r <- ls(.__S3MethodsTable__., envir = asNamespace("stats"))
> r[grep("reorder", r)]
[1] "reorder" "reorder.default" "reorder.dendrogram"
And for more details
A call to reorder searches for S3 methods in two places (see ?UseMethod):
first in the environment in which the generic function is called, and then in the registration data base for the environment in which the generic is defined (typically a namespace).
Then, loadNamespace registers a package's S3 methods with the namespace that defines the generic.
So, in your case: library(gmodels) -> loadNamespace("gdata") -> registerS3methods() for gdata.
After this, you can find it by:
> methods(reorder)
[1] reorder.default* reorder.dendrogram* reorder.factor*
Non-visible functions are asterisked
However, as reorder.factor is not on your search path, you cannot access it directly:
> reorder.factor
Error: object 'reorder.factor' not found
That is probably the whole story.
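The registration mechanism can be reproduced on a small scale. The sketch below registers a method for a made-up class (hypothetical_widget) with the base namespace, where the summary generic is defined. This is roughly what loading a package with S3method() directives does, although the real loading code path is more involved.

```r
# Hypothetical class, used only for illustration.
widget <- structure(list(), class = "hypothetical_widget")

# Register the method with the namespace that defines the generic.
registerS3method(
  "summary", "hypothetical_widget",
  function(object, ...) "widget summary",
  envir = asNamespace("base")
)

# No object of that name is visible from the search path ...
print(exists("summary.hypothetical_widget"))

# ... yet dispatch finds the method through the registration database.
result <- summary(widget)
print(result)
```

As with gdata:::reorder.factor, the method affects every later call to the generic in the session, regardless of which packages are attached.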

Getting the contents of a library interactively in R

Is there an equivalent of dir function (python) in R?
When I load a library in R like -
library(vrtest)
I want to know all the functions that are in that library.
In Python, dir(vrtest) would be a list of all attributes of vrtest.
I guess in general, I am looking for the best way to get help on R while running it in ESS on linux. I see all these man pages for the packages I have installed, but I am not sure how I can access them.
Thanks
help(package = packagename) will list all non-internal functions in a package.
Yes, use ls().
You can use search() to see what's in the search path:
> search()
[1] ".GlobalEnv" "package:stats" "package:graphics"
[4] "package:grDevices" "package:utils" "package:datasets"
[7] "package:methods" "Autoloads" "package:base"
You can search a particular package with the full name:
> ls("package:graphics")
[1] "abline" "arrows" "assocplot" "axis"
....
I also suggest that you look at this related question on Stack Overflow, which includes some more creative approaches to browsing the environment. If you're using ESS, then you can use ess-rdired.
To get the help pages on a particular topic, you can either use help(function.name) or ?function.name. You will also find the help.search() function useful if you don't know the exact function name or package. And lastly, have a look at the sos package.
help(topic) #for documentation on a topic
?topic
summary(mydata) # an overview of a data object
ls() # lists all objects in the current environment
str(object) # structure of an object
ls.str() # structure of each object returned by ls()
apropos("mytopic") # names of objects on the search path matching "mytopic"
All from the R reference card
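Putting these pieces together, a rough R counterpart to Python's dir() might look like the sketch below. It uses the stats package as a stand-in for vrtest so that it runs without installing anything; the variable names are illustrative.

```r
library(stats)  # already attached in a default session; shown for completeness

# Exported objects, as seen through the attached package environment.
exported <- ls("package:stats")
print(head(exported))

# Everything defined in the namespace, including internal objects.
all_objects <- ls(asNamespace("stats"), all.names = TRUE)
print(length(all_objects))

# Keep only the exported names that refer to functions.
pkg_env <- as.environment("package:stats")
funs <- Filter(function(nm) is.function(get(nm, envir = pkg_env)), exported)
print(head(funs))
```

For a package such as vrtest, replacing "package:stats" with "package:vrtest" after library(vrtest) should work the same way.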

Resources