I have successfully redefined utils::View as a generic function so that I can use it in my package. However, it so happens that RStudio also defines some kind of hook on this function.
Before loading my package, I see:
> View
function (...)
.rs.callAs(name, hook, original, ...)
<environment: 0x000001f74d5ff0b0>
And looking for that .rs.callAs function, I get:
> findFunction('.rs.callAs')
[[1]]
<environment: 0x000001f74eb94598>
attr(,"name")
[1] "tools:rstudio"
After loading my package, I see:
> View
standardGeneric for "View" defined from package "summarytools"
function (...)
standardGeneric("View")
<bytecode: 0x000001f752ecb7e0>
<environment: 0x000001f754a8e678>
Methods may be defined for arguments: ...
Use showMethods("View") for currently available ones.
Since tools:rstudio is not directly visible, I'm not sure there's anything I can do about it. And even if I could somehow include its definition in my package, I'm not at all sure that I could redefine View differently depending on whether the R session is running in RStudio or not.
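For what it's worth, detecting whether the session runs inside RStudio seems feasible; here is a rough sketch of the check I have in mind (relying on the RSTUDIO environment variable and the tools:rstudio search-path entry, which as far as I can tell are both set up by RStudio itself):

in_rstudio <- function() {
  # RStudio sets this environment variable for its sessions and also
  # attaches a "tools:rstudio" environment to the search path
  identical(Sys.getenv("RSTUDIO"), "1") || "tools:rstudio" %in% search()
}

in_rstudio()
# [1] TRUE   (when run inside RStudio)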
I'm clearly not very optimistic about this, but I thought of asking here before giving up!
Related
In my code, I needed to check which package a function is defined in (in my case it was exprs(): I needed it from Biobase but it turned out to be overridden by rlang).
From this SO question, I thought I could simply use environmentName(environment(functionname)). But for exprs from Biobase that expression returned an empty string:
environmentName(environment(exprs))
# [1] ""
After checking the structure of environment(exprs) I noticed that it has .Generic member which contains package name as an attribute:
environment(exprs)$.Generic
# [1] "exprs"
# attr(,"package")
# [1] "Biobase"
So, for now I made this helper function:
pkgparent <- function(functionObj) {
  functionEnv <- environment(functionObj)
  envName <- environmentName(functionEnv)
  if (envName != "")
    return(envName)
  else
    return(attr(functionEnv$.Generic, 'package'))
}
It does the job and correctly returns the package name for a function, provided its package is loaded, for example:
pkgparent(exprs)
# Error in environment(functionObj) : object 'exprs' not found
library(Biobase)
pkgparent(exprs)
# [1] "Biobase"
library(rlang)
# The following object is masked from ‘package:Biobase’:
# exprs
pkgparent(exprs)
# [1] "rlang"
But I would still like to learn how it happens that some packages' functions are defined in an "unnamed" environment while others look like <environment: namespace:packagename>.
What you’re seeing here is part of how S4 method dispatch works: .Generic is part of R’s method dispatch mechanism.
The rlang package is a red herring, by the way: the issue presents itself purely due to Biobase’s use of S4.
But more generally your resolution strategy might fail in other situations, because there are other (if rare) reasons why packages might define functions inside a separate environment. Usually this is done to create a closure over some variable.
For example, it’s generally impossible to modify variables defined inside a package at the namespace level, because the namespace gets locked when loaded. There are multiple ways to work around this. A simple one, if a package needs a stateful function, is to define that function inside an environment. For instance, you could define a counter function that increases its count on each invocation as follows:
counter = local({
  current = 0L

  function () {
    current <<- current + 1L
    current
  }
})
local defines an environment in which the function is wrapped.
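For example, calling it shows the state being kept, and its enclosing environment is exactly the kind of unnamed environment you ran into:

counter()
# [1] 1
counter()
# [1] 2
environmentName(environment(counter))
# [1] ""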
To cope with this kind of situation, what you should do instead is to iterate over parent environments until you find a namespace environment. But there’s a simpler solution, because R already provides a function to find a namespace environment for a given environment (by performing said iteration):
pkgparent = function (fun) {
  nsenv = topenv(environment(fun))
  environmentName(nsenv)
}
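Applied to the examples above, this should give something like:

pkgparent(exprs)     # with Biobase attached, as in your question
# [1] "Biobase"
pkgparent(counter)   # the local() closure defined above, at the top level
# [1] "R_GlobalEnv"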
I want to attach functions from a custom environment to the global environment, while masking possible internal functions.
Specifically, say that f() uses an internal function g(), then:
f() should not be visible in .GlobalEnv with ls(all=TRUE).
f() should be usable from .GlobalEnv.
f()'s internal function g() should be neither visible nor usable from .GlobalEnv.
First let us create environments and functions as follows:
assign('ep', value=new.env(parent=.BaseNamespaceEnv), envir=.BaseNamespaceEnv)
assign('e', value=new.env(parent=ep), envir=ep)
assign('g', value=function() print('hello'), envir=ep)
assign('f', value=function() g(), envir=ep$e)
ls(.GlobalEnv)
## character(0)
If I now run:
ep$e$f()
## Error in ep$e$f() (from #1) : could not find function "g"
In fact, the environment of f (its enclosing environment) is:
environment(get('f', envir=ep$e))
## <environment: R_GlobalEnv>
where g is not present.
Trying to change f's environment gives an error:
environment(get('f', envir=ep$e))=ep
## Error in environment(get("f", envir = ep$e)) = ep :
## target of assignment expands to non-language object
Apparently it works with:
environment(ep$e$f)=ep
attach(ep$e)
Now, as desired, only f() is usable from .GlobalEnv, g() is not.
f()
[1] "hello"
g()
## Error: could not find function "g" (intended behaviour)
Also, neither f() nor g() are visible from .GlobalEnv, but unfortunately:
ls(.GlobalEnv)
## [1] "ep"
Setting the environment associated with f() to ep places ep in .GlobalEnv.
Cluttering the Global environment was exactly what I was trying to avoid.
Can I reset the parent environment of f without making it visible from the Global one?
UPDATE
From your feedback, you suggest that I build a package to get proper namespace services.
A package is not flexible enough. My helper functions are stored in a project subdirectory, say hlp, and sourced like source("hlp/util1.R").
In this way scripts can be easily mixed and updated on the fly on a project basis.
(Added new enumerated list on top)
UPDATE 2
An almost complete solution, which does not require external packages, is now here.
Either packages or modules do exactly what you want. If you’re not happy with packages’ lack of flexibility, I suggest you give ‘box’ modules a shot: they elegantly solve your problem and allow you to treat arbitrary R source files as modules:
Just mark public functions inside the module with the comment #' @export, and load it via
box::use(./foo)
foo$f()
or
box::use(./foo[...])
f()
This fulfils all the points in your enumeration. In particular, both pieces of code make f, but not g, available to the caller. In addition, modules have numerous other advantages over using source.
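A minimal sketch of such a module file, reusing the f()/g() pair from your question (the file name foo.R is assumed here; only the function tagged with @export becomes visible to users of the module):

# foo.R
#' @export
f <- function() g()

# no @export tag: g stays internal to the module
g <- function() print('hello')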
On a more technical note, your code results in ep being inside the global environment because the assignment environment(ep$e$f)=ep creates a copy of ep inside your global environment. Once you’ve attached the environment, you can delete this object. However, the code still has issues (it’s more complex than necessary and, as Hong Ooi mentioned, you shouldn’t mess with the base namespace).
First, you shouldn't be messing around with the base namespace. Cluttering up the base because you don't want to clutter up the global environment is just silly.*
Second, you can use local() as a poor man's namespacing:
e <- local({
  g <- function() "hello"
  f <- function() g()
  environment()
})
e$f()
# [1] "hello"
* If what you have in mind is a method for storing package state, remember that (essentially) anything you put in the global environment will be placed in its own namespace when you package it up. So don't worry about cluttering things up.
Why is predict not a generic function? isGeneric('predict') is FALSE but isGeneric('summary') and isGeneric('print') are TRUE. Yet all of them have methods, which can be listed with methods('predict') etc. So predict is not a generic function as described here (also obvious from looking at its class), but it still dispatches methods depending on the object passed to it (e.g. predict.lm or predict.glm). Is there a different way R dispatches methods? How can I test whether a function has methods so that all of the examples above come out true? Yes, I can test the length of methods('predict'), but that produces a warning for functions without methods.
For starters, none of those functions are generic by your test:
> isGeneric('predict')
[1] FALSE
> isGeneric('summary')
[1] FALSE
> isGeneric('print')
[1] FALSE
Let's try again...
> isGeneric("summary")
[1] FALSE
> require(sp)
Loading required package: sp
> isGeneric("summary")
[1] TRUE
What's going on here? Well, isGeneric only tests for S4 generic functions, and when I start R, summary is an S3 generic function. If a package wants to use S4 methods and classes, and an S3 generic function already exists, it can create an S4 generic on top of it.
So, initially summary is:
> summary
function (object, ...)
UseMethod("summary")
<bytecode: 0x9e4fc08>
<environment: namespace:base>
which is an S3 generic. Then I load the sp package...
> require(sp)
Loading required package: sp
> summary
standardGeneric for "summary" defined from package "base"
function (object, ...)
standardGeneric("summary")
<environment: 0x9f9d428>
Methods may be defined for arguments: object
Use showMethods("summary") for currently available ones.
and now summary is an S4 standard generic.
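What sp effectively does when it is loaded is, roughly, the following (a sketch; the exact call inside sp may differ):

isGeneric("summary")
# [1] FALSE
setGeneric("summary")   # promote the existing S3 generic to an S4 generic
isGeneric("summary")
# [1] TRUE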
S3 generics (usually) dispatch by calling {generic}.{class}, and this is what UseMethod("summary") does in the S3 summary generic function.
If you want to test whether a function has methods for a particular class, then you probably have to check both for an S4 method (using the functions for S4 method metadata) and for an S3 method (by looking for a function called {generic}.{class}, such as summary.glm).
Great eh?
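If you want a single helper, something along these lines should work (the name has_method and the exact checks are my own, not a standard function; getS3method() with optional=TRUE returns NULL when no S3 method is registered):

has_method <- function(generic, class) {
  # S3: look for a registered {generic}.{class} method
  s3 <- !is.null(utils::getS3method(generic, class, optional = TRUE))
  # S4: only meaningful if an S4 generic exists for this name
  s4 <- isGeneric(generic) && hasMethod(generic, class)
  s3 || s4
}

has_method("predict", "lm")           # TRUE  (predict.lm, S3)
has_method("summary", "data.frame")   # TRUE
has_method("predict", "foo")          # FALSE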
My general understanding of S3 generics comes from the R Language manual, where I'm led to believe that a function func cannot be a generic without calling UseMethod in its body. So, without any fancy packages, you can use the following gist:
isFuncS3Generic <- function(func) {
  funcBody <- body(func)
  sum(grepl(pattern="UseMethod", x=funcBody)) != 0
}
edit: using your litmus test:
> isFuncS3Generic(print)
[1] TRUE
> isFuncS3Generic(predict)
[1] TRUE
> isFuncS3Generic(summary)
[1] TRUE
> isFuncS3Generic(glm)
[1] FALSE
@Hadley, I'm not really sure why this is a hard problem. Would you mind elaborating?
This question is sort of a follow-up to this post, as I'm still not fully convinced that, with respect to code robustness, it wouldn't be far better to make typing namespace::foo() a habit instead of just typing foo() and praying you get the desired result ;-)
Actual question
I'm aware that this goes heavily against "standard R conventions", but let's just say I'm curious ;-) Is it possible to attach a temporary namespace to the search path somehow?
Motivation
At a point where my package mypkg is still in "devel stage" (i.e. not a true R package yet):
I'd like to source my functions into an environment mypkg instead of .GlobalEnv
then attach mypkg to the search path (as a true namespace if possible)
in order to be able to call mypkg::foo()
I'm perfectly aware that calling :: has its downsides (it takes longer than simply typing a function's name and letting R handle the lookup implicitly) and/or might not be considered necessary due to the way a) R scans through the search path and b) packages may import their dependencies (i.e. using "Imports" instead of "Depends", not exporting certain functions etc). But I've seen my code crash at least twice due to the fact that some package has overwritten certain (base) functions, so I went from "blind trust" to "better-to-be-safe-than-sorry" mode ;-)
What I tried
AFAIU, namespaces are in principle nothing more than some special kind of environment
> search()
[1] ".GlobalEnv" "package:stats" "package:graphics"
[4] "package:grDevices" "package:utils" "package:datasets"
[7] "package:methods" "Autoloads" "package:base"
> asNamespace("base")
<environment: namespace:base>
And there's the attach() function that attaches objects to the search path. So here's what I thought:
temp.namespace <- new.env(parent=emptyenv())
attach(temp.namespace)
> asNamespace("temp.namespace")
Error in loadNamespace(name) :
there is no package called 'temp.namespace'
I guess I somehow have to work with attachNamespace() and figure out what it expects before it is called in library(). Any ideas?
EDIT
With respect to Hadley's comment: I actually wouldn't care whether the attached environment is a full-blown namespace or just an ordinary environment, as long as I could extend :: while keeping the "syntactic sugaring" feature (i.e. being able to call pkg::foo() instead of "::"(pkg="pkg", name="foo")()).
This is what the function "::" looks like:
> get("::")
function (pkg, name)
{
    pkg <- as.character(substitute(pkg))
    name <- as.character(substitute(name))
    getExportedValue(pkg, name)
}
This is what it should also be able to do in case R detects that pkg is in fact not a namespace but just some environment attached to the search path:
"::*" <- function (pkg, name)
{
  pkg <- as.character(substitute(pkg))
  name <- as.character(substitute(name))
  paths <- search()
  if (!pkg %in% paths) stop(paste("Invalid namespace environment:", pkg))
  pos <- which(paths == pkg)
  if (length(pos) > 1) stop(paste("Multiple attached envirs:", pkg))
  get(x=name, pos=pos)
}
It works, but there's no syntactic sugaring:
> "::*"(pkg="tempspace", name="foo")
function(x, y) x + y
> "::*"(pkg="tempspace", name="foo")(x=1, y=2)
[1] 3
How would I be able to call pkg::*foo(x=1, y=2) (disregarding the fact that ::* is a really bad name for a function ;-))?
There is something wrong in your motivation: your namespace does NOT have to be attached to the search path in order to use the '::' notation; it is actually the opposite.
The search path allows symbols to be picked by looking at all namespaces attached to the search path.
So, as Hadley told you, you just have to use devtools::load_all(); that's all...
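For instance (a sketch, assuming the in-development sources live in a mypkg/ directory with the usual R/ layout):

devtools::load_all("mypkg")   # sources R/, builds a namespace, attaches it
mypkg::foo()                  # '::' resolves against that namespace
foo()                         # ...and the plain call works too, since it is attached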
I'm working on an R package where one of the functions contains a match.fun call to a function in a package that's imported into the package namespace. But on loading the package, the match.fun call can't find the function name. From Hadley Wickham's description I think I'm doing everything right, but this is clearly not the case.
Example:
# in the package file header, for creation of the NAMESPACE via roxygen2:
#' @import topicmodels
# The function declaration in the package
ModelTopics <- function(doc.term.mat, num.topics, topic.method="LDA"){
  topic.fun <- match.fun(topic.method)
  output <- topic.fun(doc.term.mat, k=num.topics)
  return(output)
}
And then in R:
> library(mypackage)
> sample.output <- ModelTopics(my.dtm, topic.method="LDA", num.topics=5)
Error in get(as.character(FUN), mode = "function", envir = envir) :
object 'LDA' of mode 'function' was not found
From my understanding of namespaces, the match.fun call should have access to the package namespace, which should include the topicmodels functions. But that doesn't appear to be the case here. If I load topicmodels directly into the global environment for the R session, then this works.
Any help is much appreciated. This is R64 2.14.1 running on OSX.
UPDATE:
The package is here
Re the DESCRIPTION file, perhaps that's the problem: roxygen2 doesn't update the DESCRIPTION file with Imports: statements. But none of the other packages are listed there, either; only the match.fun calls appear to be affected.
Re the NAMESPACE extract, here's the import section:
import(catspec)
import(foreach)
import(gdata)
import(Hmisc)
import(igraph)
import(lsa)
import(Matrix)
import(plyr)
import(RecordLinkage)
import(reshape)
import(RWeka)
import(stringr)
import(tm)
import(topicmodels)
I believe this is an issue of scope. Although you have imported topicmodels and thus LDA, you haven't exported these functions, so they aren't available in the search path.
From ?match.fun:
match.fun is not intended to be used at the top level since it will
perform matching in the parent of the caller.
Thus, since you are using ModelTopics in the global environment, and LDA isn't available in the global environment, the match.fun call fails.
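Here is a small self-contained illustration of that behaviour (my own example, not from your package): a function name can be perfectly visible through the function's own lexical scope and still not be found by match.fun, because match.fun resolves it relative to the caller's parent frame:

maker <- local({
  helper <- function() "found via the closure"
  function(name) {
    print(exists(name))   # TRUE: helper is visible lexically
    match.fun(name)       # but match.fun() looks in the caller's parent frame
  }
})
maker("helper")
# [1] TRUE
# Error in get(as.character(FUN), mode = "function", envir = envir) :
#   object 'helper' of mode 'function' was not found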
It seems to me you have two options:
Option 1: use get
An alternative would be to use get where you can specify the environment. Consider this: try to use match.fun to find print.ggplot in the package ggplot2:
match.fun("print.ggplot")
Error in get(as.character(FUN), mode = "function", envir = envir) :
object 'print.ggplot' of mode 'function' was not found
Since print.ggplot isn't exported, match.fun can't find it.
However, get does find it:
get("print.ggplot", envir=environment(ggplot))
function (x, newpage = is.null(vp), vp = NULL, ...)
{
    set_last_plot(x)
    if (newpage)
        grid.newpage()
    data <- ggplot_build(x)
    gtable <- ggplot_gtable(data)
    if (is.null(vp)) {
        grid.draw(gtable)
    }
    else {
        if (is.character(vp))
            seekViewport(vp)
        else pushViewport(vp)
        grid.draw(gtable)
        upViewport()
    }
    invisible(data)
}
<environment: namespace:ggplot2>
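Applied to your ModelTopics function, Option 1 might look roughly like this (a sketch; it resolves the name explicitly in the topicmodels namespace instead of relying on match.fun's lookup):

ModelTopics <- function(doc.term.mat, num.topics, topic.method = "LDA") {
  # look the function up in the imported package's namespace, not in the
  # environment of whoever called ModelTopics
  topic.fun <- get(topic.method, envir = asNamespace("topicmodels"),
                   mode = "function")
  topic.fun(doc.term.mat, k = num.topics)
}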
Option 2: Export the functions from topicmodels that you need
If you make the necessary functions from topicmodels available in your NAMESPACE, your code should also work.
Do this by either:
Selectively exporting LDA and the other functions you need to your namespace using @export
Declaring a dependency, using Depends: topicmodels - this is the same as calling library(topicmodels) in the global environment. This will work, but is possibly a bit of a brute-force approach. It also means that any subsequent library call can mask your LDA function, thus leading to unexpected results.
Answering my own question: the Imports: and Depends: fields in the DESCRIPTION file weren't being updated after re-roxygenizing the updated code. Hence the match.fun issues. Out of curiosity, why does this affect match.fun but not a range of other function calls made elsewhere to imported package functions?