Why do some primitives have byte-codes and some do not? - r

I've noticed that when I call args on some of the primitive functions, byte-codes show up as well. But on other primitives, no byte-code appears. For example
args(length)
# function (x)
# NULL
args(list)
# function (...)
# NULL
# <bytecode: 0x44a0f38>
Why is that?
At first I thought it might be related to the ... argument, but the following disproves that theory.
args(dim)
# function (x)
# NULL
args(unclass)
# function (x)
# NULL
# <bytecode: 0x44a0450>
It's confusing to me that a byte-code only shows up in some of these, and not in others. I have always been under the impression that all primitives are special and that they all share the same "attributes" (for lack of a better word, not the actual R attributes).

As agstudy noted, this is an oddity related to how args prints things. That is, whether args includes a bytecode line in its output isn't a reliable indicator of whether or not the function was byte compiled. Compare:
args(writeLines)
## function (text, con = stdout(), sep = "\n", useBytes = FALSE)
## NULL
writeLines
## function (text, con = stdout(), sep = "\n", useBytes = FALSE)
## {
## if (is.character(con)) {
## con <- file(con, "w")
## on.exit(close(con))
## }
## .Internal(writeLines(text, con, sep, useBytes))
## }
## <bytecode: 0x000000001bf3aeb0>
We can compare printing of a bytecode line for args vs. standard function printing.
arg_shows_bytecode <- function(fn)
{
output <- capture.output(args(fn))
grepl("^<bytecode", output[length(output)])
}
printing_shows_bytecode <- function(fn)
{
output <- capture.output(print(fn))
length(output) > 1 && grepl("^<bytecode", output[length(output) - 1])
}
base_fns <- Filter(is.function, mget(ls(baseenv()), baseenv()))
yn_args <- vapply(base_fns, arg_shows_bytecode, logical(1))
yn_print <- vapply(base_fns, printing_shows_bytecode, logical(1))
It's worth noting that all functions where args shows bytecode information are primitives.
head(base_fns[yn_args])
## $`%*%`
## function (x, y) .Primitive("%*%")
##
## $as.call
## function (x) .Primitive("as.call")
##
## $attr
## function (x, which, exact = FALSE) .Primitive("attr")
##
## $`attr<-`
## function (x, which, value) .Primitive("attr<-")
##
## $attributes
## function (obj) .Primitive("attributes")
##
## $`attributes<-`
## function (obj, value) .Primitive("attributes<-")
The converse isn't true: some base functions where args doesn't show bytecode information are primitives; others are not.
yn_prim <- vapply(base_fns, is.primitive, logical(1))
table(yn_args, yn_print, yn_prim)
## , , yn_prim = FALSE
##
## yn_print
## yn_args FALSE TRUE
## FALSE 0 988
## TRUE 0 0
##
## , , yn_prim = TRUE
##
## yn_print
## yn_args FALSE TRUE
## FALSE 119 0
## TRUE 63 0
So non-primitive functions in the base package are all compiled, but args doesn't mention it. Primitive functions don't show a bytecode message when printed, and only sometimes show a bytecode message when called with args.
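One detail that helps when reading the output above: args() does not return the primitive itself. It returns an ordinary closure with the same argument list and a NULL body, which is where the NULL line in the printed output comes from. A minimal base-R sketch (the variable names prim_types and a are illustrative only):

```r
# Primitives are not closures: they are "builtin" or "special" objects.
prim_types <- c(length     = typeof(length),
                quote      = typeof(quote),
                writeLines = typeof(writeLines))
prim_types

# args() hands back an ordinary closure that only carries the argument list.
a <- args(length)
typeof(a)            # "closure"
names(formals(a))    # "x"
is.null(body(a))     # TRUE, hence the NULL line when args() output is printed
```

So the bytecode line shown by args() describes how that stand-in closure is printed, not the primitive it was derived from.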

Thanks for the report. This behavior is unintentional (a bug, as Hadley says). It is internally inconsistent: the bytecode address is displayed only for builtins and specials, and only when their formals are in .ArgsEnv (they can also be in .GenericArgsEnv). This is now fixed in R-devel. Bug reports are best sent directly to the R Bugzilla (the R-devel mailing list works as well).

Related

Does code in R throw a warning? Can I expect warning? [duplicate]

In R, how can I determine whether a function call results in a warning?
That is, after calling the function I would like to know whether that instance of the call yielded a warning.
If you want to use the try constructs, you can set the options for warn. See also ?options. A better option is tryCatch():
x <- function(i){
if (i < 10) warning("A warning")
i
}
tt <- tryCatch(x(5),error=function(e) e, warning=function(w) w)
tt2 <- tryCatch(x(15),error=function(e) e, warning=function(w) w)
tt
## <simpleWarning in x(5): A warning>
tt2
## [1] 15
if(is(tt,"warning")) print("KOOKOO")
## [1] "KOOKOO"
if(is(tt2,"warning")) print("KOOKOO")
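The tryCatch() pattern above can be folded into a small predicate. This is only a sketch: has_warning is a hypothetical helper name, and it deliberately lets errors propagate instead of catching them.

```r
# Returns TRUE if evaluating `expr` signals a warning, FALSE otherwise.
# Evaluation stops at the first warning, so the value of `expr` is discarded.
has_warning <- function(expr) {
  tryCatch({
    force(expr)   # evaluate the caller's expression inside tryCatch
    FALSE
  }, warning = function(w) TRUE)
}

has_warning(as.numeric("x"))   # coercion warning is signalled
has_warning(as.numeric("3"))   # no warning
```

Because arguments are evaluated lazily, the expression passed in is only run once force() is reached, which is what lets tryCatch() see the warning.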
To get both the result and the warning (note that the handler calls x(5) a second time, so the function runs twice):
tryCatch(x(5),warning=function(w) return(list(x(5),w)))
## [[1]]
## [1] 5
##
## [[2]]
## <simpleWarning in x(5): A warning>
Using try
op <- options(warn=2)
tt <- try(x(5))
ifelse(is(tt,"try-error"),"There was a warning or an error","OK")
options(op)
On the R-help mailing list (see http://tolstoy.newcastle.edu.au/R/help/04/06/0217.html), Luke Tierney wrote:
"If you want to write a function that computes a value and collects all warning you could do it like this:"
withWarnings <- function(expr) {
myWarnings <- NULL
wHandler <- function(w) {
myWarnings <<- c(myWarnings, list(w))
invokeRestart("muffleWarning")
}
val <- withCallingHandlers(expr, warning = wHandler)
list(value = val, warnings = myWarnings)
}
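A short usage sketch for the helper above. The definition is repeated so the snippet runs on its own, and noisy is a hypothetical function invented for the illustration.

```r
# Same helper as quoted above, repeated for self-containment.
withWarnings <- function(expr) {
  myWarnings <- NULL
  wHandler <- function(w) {
    myWarnings <<- c(myWarnings, list(w))
    invokeRestart("muffleWarning")
  }
  val <- withCallingHandlers(expr, warning = wHandler)
  list(value = val, warnings = myWarnings)
}

# Hypothetical function that warns twice and still returns a value.
noisy <- function() {
  warning("first")
  warning("second")
  42
}

res <- withWarnings(noisy())
res$value                                   # the computed value
msgs <- vapply(res$warnings, conditionMessage, character(1))
msgs                                        # messages of the collected warnings
had_warning <- length(res$warnings) > 0     # simple TRUE/FALSE answer
```

Unlike the tryCatch() variants, the calling handler resumes execution after each warning, so the function runs once and its value is kept.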
2019 update
You can use 'quietly' from the purrr package, which returns a list with the components result, output, messages and warnings. You can then extract each element by name. For instance, if you had a list that you want to map a function over, and wanted to find the elements which returned a warning, you could do
library(purrr)
library(lubridate)
datelist <- list(a = "12/12/2002", b = "12-12-2003", c = "24-03-2005")
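# capture result, output, messages and warnings for each element
quiet_list <- map(datelist, quietly(mdy))
# find the elements which produced warnings
# (warnings is an empty character vector, not NULL, when nothing was raised)
quiet_list %>% map("warnings") %>% keep(~ length(.) > 0)
# or
quiet_list %>% keep(~ length(.$warnings) != 0)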
For this example it's quite trivial, but for a long list of dataframes where the NAs might be hard to spot, this is quite useful.
Here is an example:
testit <- function() warning("testit") # function that generates warning.
assign("last.warning", NULL, envir = baseenv()) # clear the previous warning
testit() # run it
if(length(warnings())>0){ # or !is.null(warnings())
print("something happened")
}
Maybe this is somewhat indirect, but I don't know a more straightforward way.
For a simple TRUE/FALSE return on whether a given operation results in a warning (or error), you could use the is.error function from the berryFunctions package, after first setting options(warn = 2) so that warnings are converted to errors.
E.g.,
options(warn = 2)
berryFunctions::is.error(as.numeric("x")) # TRUE
berryFunctions::is.error(as.numeric("3")) # FALSE
If you want to limit the option change to the use of this function, you could just create a new function as follows.
is.warningorerror <- function(x) {
op <- options()
on.exit(options(op))
options(warn = 2)
berryFunctions::is.error(x)
}
is.warningorerror(as.numeric("x")) # TRUE
options("warn") # still 0 (default)
I personally use good old sink redirected into a text connection:
# create a new text connection directed into a variable called 'messages'
con <- textConnection("messages","w")
# sink all messages (i.e. warnings and errors) into that connection
sink(con,type = "message")
# a sample warning-generating function
test.fun <- function() {
warning("Your warning.")
return("Regular output.")
}
output <- test.fun()
# close the sink
sink(type="message")
# close the connection
close(con)
# if the word 'Warning' appears in messages then there has been a warning
warns <- paste(messages,collapse=" ")
if(grepl("Warning",warns)) {
print(warns)
}
# [1] "Warning message: In test.fun() : Your warning."
print(output)
# [1] "Regular output."
Possibly more straightforward and cleaner than the other suggested solutions.

capturing functions using rlang's enexprs

I'm writing a function such that callers of this function can write schemas declaratively:
myschema <- Schema(
patientID = character,
temp = numeric,
treated = logical,
reason_treated = factor(levels=c('fever', 'chills', 'nausea'))
)
Later, I'd like to be able to assemble dataframes using the types declared in this schema. I think the best candidate for this job is to use the metaprogramming features available in rlang:
Schema = function(...) {
schematypes = rlang::enexprs(...)
}
However, most of the examples pertain to capturing the expression and thereafter using them as arguments to functions, rather than as functions themselves. That is, I'm finding it hard to capture the right side of the following expression:
patientID = character
and then evaluate it later as character(myvec), whenever I get myvec. The same applies to the following:
reason_treated = factor(levels=c('fever', 'chills', 'nausea'))
which I would later like to evaluate as factor(myvec, levels=c('fever', 'chills', 'nausea'))
Thanks!
If I understand correctly, you are effectively constructing a schema out of functions, and you want to apply those functions to some arguments when those become available. This falls under the umbrella of functional programming rather than rlang metaprogramming.
A large portion of the functionality you want is already captured by purrr::map and its "engine" as_mapper. You can employ it directly to define
Schema <- function(...) { purrr::map( list(...), purrr::as_mapper ) }
You can now employ it to build new schemas like you suggested (with minor modifications to function definitions):
myschema <- Schema(
patientID = as.character, # Note the correct function name
temp = as.numeric, # Ditto
treated = as.logical, # Tritto
reason_treated = ~factor(., levels=c('fever', 'chills', 'nausea'))
)
# $patientID
# function (x, ...)
# as.character(x = x, ...)
# <environment: base>
#
# $temp
# function (x, ...)
# as.double(x = x, ...)
# <environment: base>
#
# $treated
# function (x, ...)
# as.logical(x = x, ...)
# <environment: base>
#
# $reason_treated
# function (..., .x = ..1, .y = ..2, . = ..1)
# factor(., levels = c("fever", "chills", "nausea"))
# <bytecode: 0x00000000027a2d00>
Given your new schema, registering new patients can be done using a sister function of map that lines up arguments from two lists / vectors:
register_patient <- function(myvec) { purrr::map2( myschema, myvec, ~.x(.y) ) }
JohnDoe <- register_patient( c(1234, 100, TRUE, "fever") )
# $patientID
# [1] "1234"
#
# $temp
# [1] 100
#
# $treated
# [1] TRUE
#
# $reason_treated
# [1] fever
# Levels: fever chills nausea
Let's verify the type of each element:
purrr::map( JohnDoe, class )
# $patientID
# [1] "character"
#
# $temp
# [1] "numeric"
#
# $treated
# [1] "logical"
#
# $reason_treated
# [1] "factor"
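For comparison, the same idea can be sketched in base R without purrr, using a plain list of converter functions and Map(). The names schema and apply_schema are hypothetical, and the input is passed as a list so each value keeps its own type.

```r
# A schema is just a named list of one-argument converter functions.
schema <- list(
  patientID      = as.character,
  temp           = as.numeric,
  treated        = as.logical,
  reason_treated = function(x) factor(x, levels = c("fever", "chills", "nausea"))
)

# Apply each converter to the value in the same position.
apply_schema <- function(schema, values) {
  stopifnot(length(schema) == length(values))
  Map(function(f, v) f(v), schema, values)
}

patient <- apply_schema(schema, list(1234, "100", 1, "fever"))
classes <- vapply(patient, function(el) class(el)[1], character(1))
classes
```

Map() takes its result names from the first list, so the converted values stay labelled with the schema's field names.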

R: preventing copies when passing a variable into a function

Hadley's new pryr package that shows the address of a variable is really great for profiling. I have found that whenever a variable is passed into a function, no matter what that function does, a copy of that variable is created. Furthermore, if the body of the function passes the variable into another function, another copy is generated. Here is a clear example
n = 100000
p = 100
bar = function(X) {
print(pryr::address(X))
}
foo = function(X) {
print(pryr::address(X))
bar(X)
}
X = matrix(rnorm(n*p), n, p)
print(pryr::address(X))
foo(X)
Which generates
> X = matrix(rnorm(n*p), n, p)
> print(pryr::address(X))
[1] "0x7f5f6ce0f010"
> foo(X)
[1] "0x92f6d70"
[1] "0x92f3650"
The address changes each time, despite the functions not doing anything. I'm confused by this behavior because I've heard R described as copy on write - so variables can be passed around but copies are only generated when a function wants to write into that variable. What is going on in these function calls?
For good R development, is it better to avoid writing multiple small functions and keep everything in one function instead? I have also found some discussion of Reference Classes, but I see very few R developers using them. Is there another efficient way to pass the variable that I am missing?
I'm not entirely certain, but address may point to the memory address of the pointer to the object. Take the following example.
library(pryr)
n <- 100000
p <- 500
X <- matrix(rep(1,n*p), n, p)
l <- list()
for(i in 1:10000) l[[i]] <- X
At this point, if each element of l was a copy of X, the size of l would be ~3.5Tb. Obviously this is not the case as your computer would have started smoking. And yet the addresses are different.
sapply(l[1:10], function(x) address(x))
# [1] "0x1062c14e0" "0x1062c0f10" "0x1062bebc8" "0x10641e790" "0x10641dc28" "0x10641c640" "0x10641a800" "0x1064199c0"
# [9] "0x106417380" "0x106411d40"
pryr::address passes an unevaluated symbol to an internal function that returns its address in the parent.frame():
pryr::address
#function (x)
#{
# address2(check_name(substitute(x)), parent.frame())
#}
#<environment: namespace:pryr>
Wrapping the above function can therefore return the address of a "promise" rather than of the value. To illustrate, we can simulate pryr::address's functionality as:
ff = inline::cfunction(sig = c(x = "symbol", env = "environment"), body = '
SEXP xx = findVar(x, env);
Rprintf("%s at %p\\n", type2char(TYPEOF(xx)), xx);
if(TYPEOF(xx) == PROMSXP) {
SEXP pr = eval(PRCODE(xx), PRENV(xx));
Rprintf("\tvalue: %s at %p\\n", type2char(TYPEOF(pr)), pr);
}
return(R_NilValue);
')
wrap1 = function(x) ff(substitute(x), parent.frame())
where wrap1 is an equivalent of pryr::address.
Now:
x = 1:5
.Internal(inspect(x))
##256ba60 13 INTSXP g0c3 [NAM(1)] (len=5, tl=0) 1,2,3,4,5
pryr::address(x)
#[1] "0x256ba60"
wrap1(x)
#integer at 0x0256ba60
#NULL
With further wrapping, we can see that a "promise" object is being constructed while the value is not copied:
wrap2 = function(x) wrap1(x)
wrap2(x)
#promise at 0x0793f1d4
# value: integer at 0x0256ba60
#NULL
wrap2(x)
#promise at 0x0793edc8
# value: integer at 0x0256ba60
#NULL
# wrap 'pryr::address' like your 'bar'
( function(x) pryr::address(x) )(x)
#[1] "0x7978a64"
( function(x) pryr::address(x) )(x)
#[1] "0x79797b8"
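The promise mechanism itself can be observed without any C code. The sketch below uses only base R and records the order of events; lazy and events are hypothetical names chosen for the illustration.

```r
events <- character()

lazy <- function(x) {
  events <<- c(events, "entered function")
  force(x)                                  # first use of x evaluates the promise
  events <<- c(events, "leaving function")
  invisible(NULL)
}

# The argument expression is wrapped in a promise and only run when forced.
lazy({ events <- c(events, "argument evaluated"); 1 })
events
```

The argument is evaluated after the function body has started, which is consistent with each call creating a fresh promise object (with its own address) around the same underlying value.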
You can use the profmem package (I'm the author) to see what memory allocations take place. It requires that your R session was built with "profmem" capabilities:
capabilities()["profmem"]
## profmem
## TRUE
Then, you can do something like this:
n <- 100000
p <- 100
X <- matrix(rnorm(n*p), nrow = n, ncol = p)
object.size(X)
## 80000200 bytes
## No copies / no new objects
bar <- function(X) X
foo <- function(X) bar(X)
## One new object
bar2 <- function(X) 2*X
foo2 <- function(X) bar2(X)
profmem::profmem(foo(X))
## Rprofmem memory profiling of:
## foo(X)
##
## Memory allocations:
## bytes calls
## total 0
profmem::profmem(foo2(X))
## Rprofmem memory profiling of:
## foo2(X)
##
## Memory allocations:
## bytes calls
## 1 80000040 foo2() -> bar2()
## total 80000040
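A base-R way to watch the same copy-on-modify behaviour is tracemem(), which reports when a traced object is duplicated. This is a sketch under the assumption that memory profiling is available in the R build (hence the guard); read_only and modify are hypothetical function names.

```r
x <- c(1, 2, 3)

read_only <- function(v) sum(v)              # never writes to v
modify    <- function(v) { v[1] <- 0; v }    # writes to v, which forces a copy

# tracemem() needs an R build with memory profiling; guard so the
# sketch still runs without it.
traced <- isTRUE(unname(capabilities("profmem")))
if (traced) invisible(tracemem(x))

total <- read_only(x)   # no tracemem message expected: nothing is duplicated
y     <- modify(x)      # a tracemem message is expected here when tracing is on

if (traced) untracemem(x)

x   # the caller's vector is unchanged
y   # the modified copy
```

Passing x into a function does not duplicate it; the copy happens only at the point where the function writes to its argument.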

Given a function defined in an R env, obtain function parameters

What I'm trying to do is trivial but I've not found a clear solution to it:
For instance, I have the following function:
sample.function <- function(a, b, named="test") {
...
}
I would like to inspect the function and obtain its arguments (maybe as an R list). If ret is the value returned by the desired function, the following assertions should all be TRUE:
ret <- magicfunction(sample.function)
ret[[1]] == "a"
ret[[2]] == "b"
ret$named == "test"
can it be done?
Here are a few things you can look at, inside or outside of the function.
> f <- function(FUN = sum, na.rm = FALSE) {
c(formals(f), args(f), match.fun(FUN))
}
> f()
$FUN
sum
$na.rm
[1] FALSE
[[3]]
function (FUN = sum, na.rm = FALSE)
NULL
[[4]]
function (..., na.rm = FALSE) .Primitive("sum")
This works for any closure, i.e. any ordinary R function (formals returns NULL for primitives such as sin). It gives a list whose names are the argument names and whose values are the defaults:
sample.function <- function(a, b, named="test") {} # test function
L <- as.list(formals(sample.function)); L
## $a
##
## $b
##
## $named
## [1] "test"
This is slightly longer but also works for primitive functions, for which formals() returns NULL:
head(as.list(args(sample.function)), -1)
# same output
head(as.list(args(sin)), -1) # sin is a primitive
## $x
Returning to the first example, to check which arguments have no default (i.e. hold the empty "missing" symbol):
sapply(L, identical, formals(function(x) {})$x)
## a b named
## TRUE TRUE FALSE
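Putting the pieces together, the magicfunction from the question could look roughly like the sketch below. arg_defaults is a hypothetical name, and the checks compare argument names rather than list elements, since arguments without a default hold the empty symbol rather than a string.

```r
# Return a named list of a function's arguments and their defaults.
# args() is used so that primitives are handled as well as closures.
arg_defaults <- function(fn) {
  head(as.list(args(fn)), -1)   # drop the trailing (NULL) body element
}

sample.function <- function(a, b, named = "test") NULL

ret <- arg_defaults(sample.function)
names(ret)      # argument names, in order
ret$named       # default value of `named`

# Arguments without a default hold the empty symbol.
no_default <- vapply(ret, identical, logical(1), quote(expr = ))
no_default
```

The same helper can be called on a primitive, for example arg_defaults(sin), because args() supplies a stand-in closure with the right argument list.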

