R: Evaluating a script in an environment - r

I would like to load a library function within a script evaluated in a specified environment.
Example:
## foo.R
## -----
## blah blah
library(extrafont)
loadfonts()
Assuming for convenience the evaluation environment is the base environment:
sys.source("foo.R")
## Registering fonts with R
## Error in eval(expr, envir, enclos) : could not find function "loadfonts"
Replacing loadfonts() with extrafont:::loadfonts() works better, but still gives:
Error in get(as.character(FUN), mode = "function", envir = envir) :
object 'pdfFonts' of mode 'function' was not found
because loadfonts() requires pdfFonts() defined in grDevices.

This is both a not totally satisfactory answer and a long comment to #waterling.
The proposed solution is:
e<- new.env()
source("foo.R", local=e)
i.e.
source("foo.R", local=new.env())
which is substantially equivalent to:
sys.source("foo.R", envir=new.env())
It works for much the same reason why:
sys.source("foo.R", envir=as.environment("package:grDevices"))
As reported in the error (see question), the function not found, pdfFonts() is part of the package grDevices The sys.source above executes the script in the package:grDevices environment, hence the function is found. Instead by default sys.source(..., envir=baseenv()) and the base environment comes before grDevices, therefore pdfFonts() is not found.
A first problem is that I do not know in advance which functions will happen to be in my script.
In this case setting envir=new.env() is a more general approach. By default
new.env(parent=parent.frame()),
therefore it has the same parent of sys.source(), which is the global environment. So everything visible in the global environment is visible in the script with sys.source(..., envir=new.env()), that is every object created by the user and by the user loaded packages.
The problem here is that we are not insulating the script any more, which makes it less reproducible and stable. In fact, it depends on what is in R memory in the very moment we call sys.source.
To make the things more practical, it means foo.R might work just because we usually call it after bar.R.
A second problem is that this not an actual solution.
The question concerns how to run a script foo.R in an environment e and still access, when needed, to functions not belonging to e. Taking an e that (directly or through its parents) has access to these functions is actually a workaround, not a solution.
If this type of workaround is the only possible way to go, IMHO, the best is to make it dependent only on standard R packages.
At start, R shows:
search()
## [1] ".GlobalEnv" "package:stats" "package:graphics"
## [4] "package:grDevices" "package:utils" "package:datasets"
## [7] "package:methods" "Autoloads" "package:base"
that is eight official packages/environments.
New packages/environments, unless explicitly changing the default, go into the second slot and all those after the first one shift one position.
myEnv=new.env()
attach(myEnv)
search()
## [1] ".GlobalEnv" "myEnv" "package:stats"
## [4] "package:graphics" "package:grDevices" "package:utils"
## [7] "package:datasets" "package:methods" "Autoloads"
## [10] "package:base"
So we can take the last eight in the search path, which means taking the first of these eight inheriting the others. We need:
pos.to.env(length(search()) - 7)
## <environment: package:stats>
## attr(,"name")
## [1] "package:stats"
## attr(,"path")
## [1] "path/to//R/R-x.x.x/library/stats"
Therefore:
sys.source("foo.R", envir=new.env(parent=pos.to.env(length(search()) - 7)))
or one can take a standard R reference package, say stats, and its parents.
Therefore:
sys.source("foo.R", envir=new.env(parent=as.environment("package:stats")))
UPDATE
I found the
SOLUTION
As for the script:
#foo.R
#-----
library(extrafont)
f=function() loadfonts()
environment(f) = as.environment("package:extrafont")
f()
To execute in a new environment:
sys.source("foo.R", envir=new.env(parent=baseenv()))
f() now has access to all objects in the package extrafont and those loaded before it.
In sys.source() creating a new.env() with whatever parent is necessary to make environment() assignment work.

Related

How R differentiates between the two filter function one in dplyr package and other for linear filtering in time series?

I wanted to filter a data set based on some conditions. When I looked at the help for filter function the result was:
filter {stats} R Documentation
Linear Filtering on a Time Series
Description
Applies linear filtering to a univariate time series or to each series separately of a multivariate time series.
After searching on web I found the filter function I needed i.e. from dplyr package. How can R have two functions with same name. What am I missing here?
At the moment the R interpreter would dispatch a call to filter to the dplyr environment, at least if the class of the object were among the avaialble methods:
methods(filter)
[1] filter.data.frame* filter.default* filter.sf* filter.tbl_cube* filter.tbl_df* filter.tbl_lazy*
[7] filter.ts*
As you can see there is a ts method, so if the object were of that class, the interpreter would instead deliver the x values to it. However, it appears that the authors of dplyr have blocked that mechanism and instead put in a warning function. You would need to use:
getFromNamespace('filter', 'stats')
function (x, filter, method = c("convolution", "recursive"),
sides = 2L, circular = FALSE, init = NULL)
{ <omitting rest of function body> }
# same result also obtained with:
stats::filter
R functions are contained in namespaces, so a full designation of a function would be: namespace_name::function_name. There is a hierarchy of namespace containers (actually "environments" in R terminology) arranged along a search path (which will vary depending on the order in which packages and their dependencies have been loaded). The ::-infix-operator can be used to specify a namespace or package name that is further up the search path than might be found in the context of the calling function. The function search can display the names of currently loaded packages and their associated namespaces. See ?search Here's mine at the moment (which is a rather bloated one because I answer a lot of questions and don't usually start with a clean systems:
> search()
[1] ".GlobalEnv" "package:kernlab" "package:mice" "package:plotrix"
[5] "package:survey" "package:Matrix" "package:grid" "package:DHARMa"
[9] "package:eha" "train" "package:SPARQL" "package:RCurl"
[13] "package:XML" "package:rnaturalearthdata" "package:rnaturalearth" "package:sf"
[17] "package:plotly" "package:rms" "package:SparseM" "package:Hmisc"
[21] "package:Formula" "package:survival" "package:lattice" "package:remotes"
[25] "package:forcats" "package:stringr" "package:dplyr" "package:purrr"
[29] "package:readr" "package:tidyr" "package:tibble" "package:ggplot2"
[33] "package:tidyverse" "tools:rstudio" "package:stats" "package:graphics"
[37] "package:grDevices" "package:utils" "package:datasets" "package:methods"
[41] "Autoloads"
At the moment I can find instances of 3 versions of filter using the help system:
?filter
# brings this up in the help panel
Help on topic 'filter' was found in the following packages:
Return rows with matching conditions
(in package dplyr in library /home/david/R/x86_64-pc-linux-gnu-library/3.5.1)
Linear Filtering on a Time Series
(in package stats in library /usr/lib/R/library)
Objects exported from other packages
(in package plotly in library /home/david/R/x86_64-pc-linux-gnu-library/3.5.1)

How does R handle conflict for a method present in multiple packages?

Suppose there is a method (let's say "sample ()") and that it is present in multiple packages (let's say both in package "base" and "arules"). Now if I call sample () which package is called, does it call the package "base" or "arules" and how does it decide which one to call?
It selects which one comes first on the search path:
search()
[1] ".GlobalEnv" "package:arules" "package:Matrix"
[4] "package:stats" "package:graphics" "package:grDevices"
[7] "package:utils" "package:datasets" "package:methods"
[10] "Autoloads" "package:base"
So it would be the arules version. This is an S4 method which actually may call the base version anyway. Note that base is always last on the search path, and the global environment is always first. Generally packages are loaded in second place (can be changed with the pos argument to library), and moved down as others are loaded.
I use this:
base::sample()
arules::sample()

Change search() locations without unloading packages

I am using code which depends on two packages that conflict. I would like to give one priority for only a short period of time and my plan is to just move it up to the front of search(). However, I can't just unload and reload. I tried that and it causes other problems, and running library on an already loaded package does not work.
Here is an example (the real use case involves non-CRAN packages):
library(ggplot2)
library(MASS)
> search()
[1] ".GlobalEnv" "package:MASS" "package:ggplot2"
[4] "package:stats" "package:graphics" "package:grDevices"
[7] "package:utils" "package:datasets" "package:methods"
[10] "Autoloads" "package:base"
How can I now move package:ggplot2 ahead of package:MASS without detaching/unloading ggplot2?
EDIT
Inside the function I need to call, say function1, there is an expression that makes further calls. I cannot edit those calls to append ::.
e.g.
unchangeable <- function1("abc") ~ function2("def")
Suppose mainFun is the one I want to call. I can do
mainFun(unchangeable)
but I cannot specify
mainFun::unchangeable
It is indeed possible to edit unchangeable by manipulating formula objects. But that is not ideal and I need a more general solution for an object of other types.
EDIT2:
Here is an example, which shows a similar problem.
library(mgcv)
library(gam)
y <- rnorm(100)
x <- rnorm(100)
thisformula <- y ~ s(x)
gamgam <- gam(thisformula)
# s <- mgcv::s
mgcvgam <- mgcv::gam(thisformula)
This gives me the error
Error: $ operator is invalid for atomic vectors
Uncommenting the line s <- mgcv::s solves the problem in this case. But in my more general case it doesn't, and in any case it seems like a hack. How can I have all functions that are called within mgcv::gam first be looked up in mgcv?
You can refer to the function in the specific package using ::. For example ggplot2::labs will always refer to that function under ggplot2 even if it is masked by some later package being loaded

Attaching a temporary namespace to the search path

This question is sort of a follow up to this post as I'm still not fully convinced that, with respect to code robustness, it wouldn't be far better to make typing namespace::foo() habit instead of just typing foo() and praying you get the desired result ;-)
Actual question
I'm aware that this goes heavily against "standard R conventions", but let's just say I'm curious ;-) Is it possible to attach a temporary namespace to the search path somehow?
Motivation
At a point where my package mypkg is still in "devel stage" (i.e. not a true R package yet):
I'd like to source my functions into an environment mypkg instead of .GlobalEnv
then attach mypkg to the search path (as a true namespace if possible)
in order to be able to call mypkg::foo()
I'm perfectly aware that calling :: has its downsides (it takes longer than simply typing a function's name and letting R handle the lookup implicitly) and/or might not be considered necessary due to the way a) R scans through the search path and b) packages may import their dependencies (i.e. using "Imports" instead of "Depends", not exporting certain functions etc). But I've seen my code crash at least twice due to the fact that some package has overwritten certain (base) functions, so I went from "blind trust" to "better-to-be-safe-than-sorry" mode ;-)
What I tried
AFAIU, namespaces are in principle nothing more than some special kind of environment
> search()
[1] ".GlobalEnv" "package:stats" "package:graphics"
[4] "package:grDevices" "package:utils" "package:datasets"
[7] "package:methods" "Autoloads" "package:base"
> asNamespace("base")
<environment: namespace:base>
And there's the attach() function that attaches objects to the search path. So here's what I thought:
temp.namespace <- new.env(parent=emptyenv())
attach(temp.namespace)
> asNamespace("temp.namespace")
Error in loadNamespace(name) :
there is no package called 'temp.namespace'
I guess I somehow have to work with attachNamepace() and figure out what this expects before it is called in in library(). Any ideas?
EDIT
With respect to Hadley's comment: I actually wouldn't care whether the attached environment is a full-grown namespace or just an ordinary environment as long as I could extend :: while keeping the "syntactic sugering" feature (i.e. being able to call pkg::foo() instead of "::"(pkg="pkg", name="foo")()).
This is how function "::" looks like:
> get("::")
function (pkg, name)
{
pkg <- as.character(substitute(pkg))
name <- as.character(substitute(name))
getExportedValue(pkg, name)
}
This is what it should also be able to do in case R detects that pkg is in fact not a namespace but just some environment attached to the search path:
"::*" <- function (pkg, name)
{
pkg <- as.character(substitute(pkg))
name <- as.character(substitute(name))
paths <- search()
if (!pkg %in% paths) stop(paste("Invalid namespace environment:", pkg))
pos <- which(paths == pkg)
if (length(pos) > 1) stop(paste("Multiple attached envirs:", pkg))
get(x=name, pos=pos)
}
It works, but there's no syntactic sugaring:
> "::*"(pkg="tempspace", name="foo")
function(x, y) x + y
> "::*"(pkg="tempspace", name="foo")(x=1, y=2)
[1] 3
How would I be able to call pkg::*foo(x=1, y=2) (disregarding the fact that ::* is a really bad name for a function ;-))?
There is something wrong in your motivation: your namespace does NOT have to be attached to the search path in order to use the '::' notation, it is actually the opposite.
The search path allows symbols to be picked by looking at all namespaces attached to the search path.
So, as Hadley told you, you just have to use devtools::load_all(), that's all...

Getting the contents of a library interactively in R

Is there an equivalent of dir function (python) in R?
When I load a library in R like -
library(vrtest)
I want to know all the functions that are in that library.
In Python, dir(vrtest) would be a list of all attributes of vrtest.
I guess in general, I am looking for the best way to get help on R while running it in ESS on linux. I see all these man pages for the packages I have installed, but I am not sure how I can access them.
Thanks
help(package = packagename) will list all non-internal functions in a package.
Yes, use ls().
You can use search() to see what's in the search path:
> search()
[1] ".GlobalEnv" "package:stats" "package:graphics"
[4] "package:grDevices" "package:utils" "package:datasets"
[7] "package:methods" "Autoloads" "package:base"
You can search a particular package with the full name:
> ls("package:graphics")
[1] "abline" "arrows" "assocplot" "axis"
....
I also suggest that you look at this related question on stackoverflow which includes some more creative approaching to browsing the environment. If you're using ESS, then you can use Ess-rdired.
To get the help pages on a particular topic, you can either use help(function.name) or ?function.name. You will also find the help.search() function useful if you don't know the exact function name or package. And lastly, have a look at the sos package.
help(topic) #for documentation on a topic
?topic
summary(mydata) #an overview of data objects try
ls() # lists all objects in the local namespace
str(object) # structure of an object
ls.str() # structure of each object returned by ls()
apropos("mytopic") # string search of the documentation
All from the R reference card

Resources