knitr gets tricked by data.table `:=` assignment

It seems that knitr doesn't understand that DT[, a:=1] should not result in an output of DT to the document. Is there a way to stop this behaviour?
Example knitr document:
Data.Table Markdown
========================================================
Suppose we make a `data.table` in **R Markdown**
```{r}
library(data.table)
DT = data.table(a = rnorm(10))
```
Notice that it doesn't display the contents until we do a
```{r}
DT
```
style command. However, if we want to use `:=` to create another column
```{r}
DT[, c:=5]
```
It would appear that the absence of an equals sign tricks `knitr` into thinking this
is to be printed.
Knitr Output:
Is this a knitr bug or a data.table bug?
EDIT
I have only just noticed that knitr also behaves oddly when it echoes the code. Look at the output above. In my source code I have DT[, c:=5] but what knitr renders is
DT[, `:=`(c, 5)]
Weird...
EDIT 2: Caching
Caching also seems to have a problem with := but that must be a different cause, so is a separate question here: why does knitr caching fail for data.table `:=`?

Update Oct 2014. Now in data.table v1.9.5 :
:= no longer prints in knitr for consistency with behaviour at the prompt, #505. Output of a test knit("knitr.Rmd") is now in data.table's unit tests.
and related :
if (TRUE) DT[,LHS:=RHS] now doesn't print (thanks to Jureiss, #869). Test added. To get this to work we've had to live with one downside: if a := is used inside a function with no DT[] before the end of the function, then the next time DT is typed at the prompt, nothing will be printed. A repeated DT will print. To avoid this: include a DT[] after the last := in your function. If that is not possible (e.g., it's not a function you can change) then print(DT) and DT[] at the prompt are guaranteed to print. As before, adding an extra [] on the end of a := query is a recommended idiom to update and then print; e.g. > DT[,foo:=3L][]
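As a rough illustration of the idiom described above (only a sketch: `add_flag` and the column names are made up for the example, and it assumes data.table v1.9.5 or later is installed):

```r
library(data.table)

# Hypothetical helper: adds a column by reference inside a function.
add_flag <- function(dt) {
  dt[, flag := TRUE]  # update by reference; nothing is printed
  dt[]                # trailing [] so that a later print of the table is not swallowed
}

DT <- data.table(a = 1:3)
add_flag(DT)          # prints DT, now with the new column
DT[, b := a * 2L][]   # update-then-print idiom at the prompt
```

Without the `dt[]` line, the downside mentioned in the NEWS item applies: the first `DT` typed at the prompt afterwards may print nothing.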
Previous answer kept for posterity (the global$depthtrigger business is no longer done as from data.table v1.9.5 so this is no longer true) ...
Just to be clear I understand then: knitr is printing when you don't want it to.
Try increasing data.table:::.global$depthtrigger a little bit at the start of the script.
This will be 3 for you currently :
data.table:::.global$depthtrigger
[1] 3
I don't know how much eval depth knitr adds to the stack. But try changing the trigger to 4 first; i.e.
assign("depthtrigger", 4, data.table:::.global)
and at the end of the knitr script ensure to set it back to 3. If 4 doesn't work, try 5, then 6. If you get to 10 give up and I'll think again. ;-P
Why might this work?
See NEWS from v1.8.4 :
DT[,LHS:=RHS,...] no longer prints DT. This implements #2128 "Try
again to get DT[i,j:=value] to return invisibly". Thanks to discussions here :
how to suppress output when using `:=` in R {data.table}, prior to v1.8.3?
http://r.789695.n4.nabble.com/Avoiding-print-when-using-tp4643076.html
FAQs 2.21 and 2.22 have been updated.
FAQ 2.21 Why does DT[i,col:=value] return the whole of DT? I expected either no visible value (consistent with <-), or a message or return
value containing how many rows were updated. It isn't obvious that the
data has indeed been updated by reference. This has changed in v1.8.3
to meet your expectations. Please upgrade. The whole of DT is returned
(now invisibly) so that compound syntax can work; e.g.,
DT[i,done:=TRUE][,sum(done)]. The number of rows updated is returned
when verbosity is on, either on a per query basis or globally using
options(datatable.verbose=TRUE).
FAQ 2.22 Ok, thanks. What was so difficult about the result of DT[i,col:=value] being returned invisibly? R internally forces
visibility on for [. The value of FunTab's eval column (see
src/main/names.c) for [ is 0 meaning force R_Visible on (see
R-Internals section 1.6). Therefore, when we tried invisible() or
setting R_Visible to 0 directly ourselves, eval in src/main/eval.c
would force it on again. To solve this problem, the key was to stop
trying to stop the print method running after a :=. Instead, inside :=
we now (from v1.8.3) set a global flag which the print method uses to
know whether to actually print or not.
That global flag is data.table:::.global$print. At the top of data.table:::print.data.table you'll see it looking at it. That's because there is no known way to suppress printing from [ (as FAQ 2.22 explains).
So, inside := inside [.data.table it looks to see how "deep" this call is :
if (Cstack_info()[["eval_depth"]] <= .global$depthtrigger) {
    suppPrint = function(x) { .global$print=FALSE; x }
    # Suppress print when returns ok not on error, bug #2376.
    # Thanks to: https://stackoverflow.com/a/13606880/403310
    # All appropriate returns following this point are
    # wrapped i.e. return(suppPrint(x)).
}
Essentially that's just saying: if DT[,x:=y] is running at the prompt, then I know the REPL is going to call the print method on my result, beyond my control. Ok, so given that the print method is going to run, I'm going to suppress it inside that print method by setting a flag (since the print method that runs, i.e. print.data.table, is something I can control).
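The flag idea can be sketched in base R without data.table at all. Everything below is hypothetical (the `.flag` environment, `quiet_update()` and the `flagged` class are invented names); it is not data.table's actual code, only the same pattern in miniature:

```r
# A package-level environment holding the "should I print?" flag.
.flag <- new.env()
.flag$print <- TRUE

# Stand-in for `:=`: does its work, then asks the next print to stay silent.
quiet_update <- function(x) {
  .flag$print <- FALSE
  x
}

# The print method is the one place we control, so the check lives here.
print.flagged <- function(x, ...) {
  if (!.flag$print) {
    .flag$print <- TRUE   # reset, so that the following print works again
    return(invisible(x))
  }
  cat("<flagged object with", length(unclass(x)), "elements>\n")
  invisible(x)
}

obj <- structure(list(1, 2, 3), class = "flagged")
out1 <- capture.output(print(quiet_update(obj)))  # suppressed by the flag
out2 <- capture.output(print(obj))                # printed normally
```

The weakness is the same one discussed here: if the update happens somewhere the print method never runs, the flag stays set and swallows the next unrelated print.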
In knitr's case it's simulating the REPL in a clever way. It isn't really a script, iiuc, otherwise DT[,x:=y] wouldn't print anyway for that reason. But because it's simulating REPL via an eval there is an extra level of eval depth for code run from knitr. Or something similar (I don't know knitr).
Which is why I'm thinking increasing the depthtrigger might do the trick.
Hacky/crufty, I agree. But if it works, and you let me know which value works, I can change data.table to be knitr aware and change the depthtrigger automatically. Or any better solutions are most welcome.

Why not just use:
```{r, results='hide'}
DT[, c:=5]
```

For anyone returning to this in 2017 with RMarkdown 1.3 and data.table 1.10 or similar, there was a resurgence of this bug, as identified and documented here
This was subsequently fixed in RMarkdown 1.4

Just surround the expression with invisible(). This works for me.
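For example, a minimal sketch (assuming data.table is installed; the table itself is just made-up data):

```r
library(data.table)

DT <- data.table(a = 1:3)
# The update still happens by reference; invisible() only hides the returned value.
invisible(DT[, c := 5])
```

In a knitr chunk this produces no output, and `DT` has the new column afterwards.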

I've run across the same problem and solved it fairly easily by re-assigning the variable. In your case:
DT <- DT[, ':=' (c, 5)]
It's a bit more verbose though, especially if the variable name is big.

Related

utils::globalVariables(.) not applicable to R CMD CHECK note:no visible binding for global variable '.' [duplicate]

I noticed in checking a package that I obtain notes "no visible binding for global variable" when I use functions like subset that use verbatim names of list elements as arguments.
For example with a data frame:
foo <- data.frame(a=c(TRUE,FALSE,TRUE),b=1:3)
I can do silly things like:
subset(foo,a)
transform(foo,a=b)
Which work as expected. The R code check in R CMD however doesn't understand that these refer to elements and complains about there not being any visible bindings of global variables.
While this works ok, I don't really like having notes in my package and prefer for it to pass the check with no errors, warnings and notes at all. I also don't really want to rework my code too much. Is there a way to write this code so that it is clear the arguments do not refer to global variables?
To get it past R CMD check you can either :
Use get("b") (but that is onerous)
Place a=b=NULL somewhere higher up in your function (that's what I do)
There was a thread on r-devel a while ago where somebody from r-core basically said (from memory) "NOTES are ok, you know. The assumption is that the author checked it and is ok with the NOTE.". But, I agree with you. I do prefer to have CRAN checks return a clean "OK" on all platforms. That way the user is left in no doubt that it passes checks ok.
EDIT :
Here is the r-devel thread I was remembering (from April 2010). So that appears to suggest that there are some situations where there is no known way to avoid the NOTE, but that's ok.
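A small sketch of the `a=b=NULL` approach mentioned above, with an invented function name (`count_flagged`) and the `foo` data frame from the question:

```r
# Hypothetical package function that uses subset() with bare column names.
count_flagged <- function(df) {
  a <- b <- NULL   # dummy local bindings, so the code check sees a and b as locals
  nrow(subset(df, a & b > 1))
}

foo <- data.frame(a = c(TRUE, FALSE, TRUE), b = 1:3)
n <- count_flagged(foo)
```

The dummy values are never used: `subset` looks the names up in the columns of `df` first, so the `NULL`s only matter to the static check.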
This is one of the potential "unanticipated consequences" of using subset non-interactively. As it says in the Warning section of ?subset:
This is a convenience function intended for use interactively. For
programming it is better to use the standard subsetting functions like
‘[’, and in particular the non-standard evaluation of argument
‘subset’ can have unanticipated consequences.
From R version 2.15.1 onwards there is a way around this:
if(getRversion() >= "2.15.1") utils::globalVariables(c("a", "othervar"))
As per the warning section of ?subset it is better to use subset interactively, and [ for programming.
I would replace a command like
subset(foo,a)
with
foo[foo$a]
or if foo is a dataframe:
foo[foo$a, ]
you might also like to use with if foo is a dataframe and the expression to be evaluated is complex:
with(foo, foo[a, ])
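To check that these forms select the same rows, here is a quick comparison on the `foo` data frame from the question (the `via_*` names are only for illustration):

```r
foo <- data.frame(a = c(TRUE, FALSE, TRUE), b = 1:3)

via_subset  <- subset(foo, a)        # interactive convenience
via_bracket <- foo[foo$a, ]          # explicit, no bare column names
via_with    <- with(foo, foo[a, ])   # columns visible inside with()
```

One difference to keep in mind: `subset` treats `NA` in the condition as `FALSE`, whereas `foo[foo$a, ]` returns a row of `NA`s for each `NA` in `foo$a`.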
I had this issue and traced it to my ggplot2 section.
This code produced the note:
ggplot2::ggplot(data = spec.df, ggplot2::aes(E.avg, fraction)) +
  ggplot2::geom_line() +
  ggplot2::ggtitle(paste0(title))
Adding the data name to the parameters eliminated the note:
ggplot2::ggplot(data = spec.df, ggplot2::aes(spec.df$E.avg, spec.df$fraction)) +
  ggplot2::geom_line() +
  ggplot2::ggtitle(paste0(title))

Suppress all output from the compute.es functions

I'm using the compute.es package (http://cran.r-project.org/web/packages/compute.es/compute.es.pdf) to compute effect sizes. Now, when using one of the functions from this package, the result is printed even though you assign it to a vector, and I would like to suppress this.
For example,
library("compute.es")
mes(5,5,5,5,5,5,level=95,dig=2,id=NULL,data=NULL)
prints a lot of information. By using capture.output like so
library("compute.es")
capture.output(mes(5,5,5,5,5,5,level=95,dig=2,id=NULL,data=NULL))
a lot of it gets suppressed, but not all. I've had no luck with sink() (which breaks the whole function) or invisible() either.
How can I suppress all printed information from this function?
Version 0.2-4 of the compute.es package has a 'verbose' argument, so e.g.:
require(compute.es) # VERSION >= 0.2-4
des(.3, 30, 30, verbose=FALSE) # WILL SUPPRESS PRINTING TO CONSOLE
This function is really bi-polar. Some things are printed using cat, others using message. In addition to what you've tried you can also try suppressMessages.
This worked for me.
x <- capture.output(suppressMessages(mes(5,5,5,5,5,5,level=95,dig=2,id=NULL,data=NULL)))
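To see why both wrappers are needed, here is a self-contained sketch with a made-up function (`noisy`, standing in for something like `mes`) that writes through both channels:

```r
# Hypothetical stand-in for a function that prints with both cat() and message().
noisy <- function() {
  cat("printed via cat\n")          # goes to stdout
  message("printed via message")    # goes to stderr as a message condition
  42
}

# capture.output() swallows stdout, suppressMessages() swallows the message;
# the assignment inside keeps the actual return value.
captured <- capture.output(result <- suppressMessages(noisy()))
```

Here `captured` holds the text that `cat` would have printed, and `result` holds the function's return value.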
Alternatively, you can hack the function (use the source!) and cut out all the cat and message statements. Another way would be to add another argument to the function (like verbose) and turn on/off messages by putting them inside an if clause. E.g.
if (!is.null(data)) {
if (verbose) {
cat("\n")
message(" EFFECT SIZE CALCULATION (FOR VECTOR INPUT)")
cat("\n")
}
...

Problems with reassignInPackage

I am trying to understand the way the YourCast R package works and make it work with my data.
For example, if a function produces errors, I
get the source code of that function using YourCast:::bad.fn
add outputs of critical values at critical stages
use reassignInPackage(name="original.fn", pkgName="YourCast", value=my.fn)
Once I find the cause of the error, I fix it in the function and reassign it in the package.
However, for some strange reason this does not work for non-hidden functions.
For example:
install.packages("YourCast")
library(YourCast)
library(R.utils)  # provides reassignInPackage()
YourCast:::check.depvar
This will print the hidden function check.depvar. One line if (all(ix == 1:3)) will produce an error message if any of the x is missing.
Thus, I change the whole function to the following and replace the original formula:
mzuba.check.depvar <- function(formula)
{
  return (grepl("log[(]",as.character(formula)[2]))
}
reassignInPackage("check.depvar",
                  pkgName="YourCast",
                  mzuba.check.depvar)
rm(mzuba.check.depvar)
Now YourCast:::check.depvar will print my version of that function, and everything is fine.
However
YourCast::yourcast or YourCast:::yourcast or simply yourcast will print the non-hidden function yourcast. Suppose I want to change that function as well.
reassignInPackage(name="yourcast",
                  pkgName="YourCast",
                  value=test)
Now, YourCast::yourcast and YourCast:::yourcast will print the new, modified version but yourcast still gives the old version!
That might not be a problem if I could simply call YourCast::yourcast instead of yourcast, but that produces some kind of error that I can't trace back, because suddenly RStudio does not print error messages at all anymore, although it still evaluates whatever it can:
> Uagh! do something!
> 1 + 1
[1] 2
> Why no error msg?
>
Restarting the R-session will solve the error-msg problem, though.
So my question is: How do I reassign non-hidden functions in packages?
Furthermore (this would facilitate testing a lot), is there a way to make all hidden functions available without using the ::: operator? I.e., how to export all functions from a package?

In R, is it possible to suppress "Note: no visible binding for global variable"?

I'm wondering if it's possible to suppress these outputs in R which are cluttering up the console:
Note: no visible binding for global variable '.->ConfigString'
Note: no visible binding for '<<-' assignment to 'ConfigString'
Here is the code (it's a simple ReferenceClass to store configuration for an R project):
# Reference Class to store configuration
Config <- setRefClass("Config",
  fields = list(
    ConfigString = "character"
  ),
  methods = list(
    # Constructor
    initialize = function() {
      ConfigString <<- "Hello, World!"
    }
  )
)
What I have tried so far
I've tried every combination and permutation of predefining the variables, pre-setting them to NULL, etc., but R is still stubbornly printing hundreds of "No Visible Binding" notes in my source code.
Is anyone wiser than I when it comes to the internals of R?
Update 1
I've tried changing Config <- to Config <<-, and that gets rid of the second extraneous note. The first extraneous note is still present, however.
Update 2
I'm beginning to lose heart, even sample code by John Chambers generates more of these horrible, extraneous notes.
Update 3
These notes occur in Revolution R v7.0, but don't occur in RStudio. It appears as if Revolution R v7.0 is calling R CMD check, which is normally only used when preparing packages, so these notes can safely be ignored.
Update 4
Hadley Wickham's code also generates these notes. Apparently, it is possible to eliminate them using utils::globalVariables; however, this doesn't seem to work on the newer ReferenceClasses. Even if it were at all possible to use them, Hadley states:
globalVariables is a hideous hack and I will never use it.
All credit to @Tyler Rinker for this answer.
To eliminate these notes, prefix the source code above with this:
# Intent:
# This function suppresses the following notes generated by "R CMD check":
# - "Note: no visible binding for global variable '.->ConfigString'"
# - "Note: no visible binding for '<<-' assignment to 'ConfigString'"
# Usage:
# Add the following right in the beginning of the .r file (before the Reference
# class is defined in the sourced .r file):
# suppressBindingNotes(c(".->ConfigString","ConfigString"))
suppressBindingNotes <- function(variablesMentionedInNotes) {
  for(variable in variablesMentionedInNotes) {
    assign(variable,NULL, envir = .GlobalEnv)
  }
}
suppressBindingNotes(c(".->ConfigString","ConfigString"))
In addition, sometimes Revolution R might need to be restarted if it has been running for a long time.
You can try this command.
compiler::setCompilerOptions(suppressAll = TRUE)
This works for me to suppress the messages like
Note: no visible binding for global variable ...
Note: no visible binding for global function definition ...
