implementation of dot function `.()` in data.table package [duplicate] - r

This question already has an answer here:
How is dot (.) alias for list constructor implemented in data.table package?
(1 answer)
Closed 4 years ago.
from ?data.table::data.table :
The expression '.()' is a shorthand alias to list(); they both mean
the same
However this function is nowhere to be found :
data.table:::.
Error in get(name, envir = asNamespace(pkg), inherits = FALSE) :
object '.' not found
So I suppose the input is parsed somehow, how is it done ? I'd like to use the same feature in my own package.
The following works not too bad :
test <- function(x) {
eval(substitute(
eval.parent(substitute(x, list(.=list)))
))
}
foo <- "bar"
test(.(foo))
# [[1]]
# [1] "bar"
identical(test(.(foo)), list(foo))
# [1] TRUE
However there will be some dot variables used inside this dot function, and this fails :
. <- "baz"
test(.(foo,.))
# [[1]]
# [1] "bar"
#
# [[2]]
# function (...) .Primitive("list")
Expected :
# [[1]]
# [1] "bar"
#
# [[2]]
# [1] "baz"

The data.table package accomplishes it with this bit of code
replace_dot_alias <- function(e) {
# we don't just simply alias .=list because i) list is a primitive (faster to iterate) and ii) we test for use
# of "list" in several places so it saves having to remember to write "." || "list" in those places
if (is.call(e)) {
# . alias also used within bquote, #1912
if (e[[1L]] == 'bquote') return(e)
if (e[[1L]] == ".") e[[1L]] = quote(list)
for (i in seq_along(e)[-1L]) if (!is.null(e[[i]])) e[[i]] = replace_dot_alias(e[[i]])
}
e
}
found in R/data.table.R (currently at line 173). That's why you don't find data.table:::. anywhere, and how they accomplish the parsing you mention in your post.
Then in [.data.table" <- function (x, i, j,... they can do this sort of thing
if (!missing(j)) {
jsub = replace_dot_alias(substitute(j))
root = if (is.call(jsub)) as.character(jsub[[1L]])[1L] else ""
....

Related

get name of an object inside a loop

I know the deparse+substitute trick to get the name from an object passed as argument to a function, but the same trick inside a loop does no work.
My code (just for testing):
mylist <- list(first = c("lawyer","janitor"), second = c("engineer","housewife"))
for (element in names(mylist)){
print(deparse(substitute(mylist[[element]])))
}
[1] "mylist[[element]]"
[1] "mylist[[element]]"
is there any way of getting the result?:
first
second
using lapply
lapply(mylist, function(x) { print(names(x))} )
# NULL
# NULL
# $first
# NULL
#
# $second
# NULL
using for loop as per your question
for (element in names(mylist)){
print(element)
}
# [1] "first"
# [1] "second"
Use "names"
for (element in names(mylist)){
print(as.name(element))
}

What's the difference between setMethod("$<-") and set setReplaceMethod("$")?

Question
When programming in r with the s4 OOP system, when one have to use setReplaceMethod? I don't see what's the difference with the setMethod when adding <- to the name of the function. Does setMethod("$<-") and setReplaceMethod("$") are equal?
Documentation
I didn't find anything in the documentation with ?setReplaceMethod or ??setReplaceMethod. There is nothing except the usage.
In StackOverflow, there is several questions about setReplaceMethod but none helping. I started to search a answer to this question when I saw it seem's is not possible to use roxygen2 to document methods created with setReplaceMethod
I didn't find anything by searching in r-project.org
Reproductible example
library(methods)
# Create a class
setClass("TestClass", slots = list("slot_one" = "character"))
# Test with setMethod -----------------------
setMethod(f = "$<-", signature = "TestClass",
definition = function(x, name, value) {
if (name == "slot_one") x#slot_one <- as.character(value)
else stop("There is no slot called",name)
return(x)
}
)
# [1] "$<-"
test1 <- new("TestClass")
test1$slot_one <- 1
test1
# An object of class "TestClass"
# Slot "slot_one":
# [1] "1"
# Use setReplaceMethod instead -----------------------
setReplaceMethod(f = "$", signature = "TestClass",
definition = function(x, name, value) {
if (name == "slot_one") x#slot_one <- as.character(value)
else stop("There is no slot called",name)
return(x)
}
)
# An object of class "TestClass"
# Slot "slot_one":
# [1] "1"
test2 <- new("TestClass")
test2$slot_one <- 1
test2
# [1] "$<-"
# See if identical
identical(test1, test2)
# [1] TRUE
Actual conclusion
setReplaceMethod seems only to permit to avoid the <- when creating a set method. Because roxygen2 can't document methods produced with, it's better for the moment to use setMethod. Does I have right?
Here's the definition of setReplaceMethod
> setReplaceMethod
function (f, ..., where = topenv(parent.frame()))
setMethod(paste0(f, "<-"), ..., where = where)
<bytecode: 0x435e9d0>
<environment: namespace:methods>
It is pasting a "<-" on to the name, so is functionally equivalent to setMethod("$<-"). setReplaceMethod conveys more semantic meaning.

Methods for recursive concatenations on user defined classes in R?

All the concatenation functions in R can be rewritten as recursive functions. For instance, I could use c as a binary operator and I could define a new concat function as
concat <- function(...) {
Reduce(c, ...)
}
and concat would function as c actually functions in R base.
R syntactically uses many such functions, for instance c for vectors and lists, cbind for arrays, data.frames and matrices. When defining new object classes, it makes sense to create a method for combining them using a function which takes ... as an argument.
I know R can match methods to objects when they are the first object in the argument list, but what if I define a method like
concat <- function(...) {
UseMethod('concat')
}
concat.numeric <- function(...) {
c(...)
}
concat.character <- function(...) {
c(...)
}
myCon <- function(charPart, numPart) {
out <- list(charPart=charPart, numPart=numPart)
class(out) <- "myClass"
out
}
concat.myClass <- function(...) {
myCon(sapply(..., `[[`, 'charPart'), sapply(..., `[[`, 'numPart'))
}
> concat(4, 6, 'a')
"4" "6" "a"
> myObj1 <- myCon('a', 1)
> myObj2 <- myCon('b', 2)
> concat(myObj1, myObj2)
Error in get(as.character(FUN), mode = "function", envir = envir) :
object 'p' of mode 'function' was not found
At what point does R identify the types of arguments supplied to concat? How can I convince R to attempt to cast arguments to concat up to my specific object class?
You're not quite passing what you think to sapply you need to put it in list(...) so sapply can iterate through the elements rather than parsing them as extra arguments in the wrong place.
concat.myClass <- function(...)
{
myCon(sapply(list(...), `[[`, 'charPart'), sapply(list(...), `[[`, 'numPart'))
}
> myObj1
$charPart
[1] "a"
$numPart
[1] 1
attr(,"class")
[1] "myClass"
> myObj2
$charPart
[1] "b"
$numPart
[1] 2
attr(,"class")
[1] "myClass"
this then gives:
> concat(myObj1, myObj2)
$charPart
[1] "a" "b"
$numPart
[1] 1 2
attr(,"class")
[1] "myClass"
.. which I presume is what you want???

R - store functions in a data.frame

I would like to return a matrix/data.frame each row containing arguments and the content of a file.
However, there may be many files, so I would prefer if I could load the file lazily, so the file is only read if the actual content is requested. The function below loads the files actively if as.func=F.
It would be perfect if it could load them lazily, but it would also be acceptable, if instead of the content a function is returned that would read the content.
I can make functions that read the content (see below with as.func=T), but for some reason I cannot put that into the data.frame to return.
load_parallel_results <- function(resdir,as.func=F) {
## Find files called .../stdout
stdoutnames <- list.files(path=resdir, pattern="stdout", recursive=T);
## Find files called .../stderr
stderrnames <- list.files(path=resdir, pattern="stderr", recursive=T);
if(as.func) {
## Create functions to read them
stdoutcontents <-
lapply(stdoutnames, function(x) { force(x); return(function() { return(paste(readLines(paste(resdir,x,sep="/")),collapse="\n")) } ) } );
stderrcontents <-
lapply(stderrnames, function(x) { force(x); return(function() { return(paste(readLines(paste(resdir,x,sep="/")),collapse="\n")) } ) } );
} else {
## Read them
stdoutcontents <-
lapply(stdoutnames, function(x) { return(paste(readLines(paste(resdir,x,sep="/")),collapse="\n")) } );
stderrcontents <-
lapply(stderrnames, function(x) { return(paste(readLines(paste(resdir,x,sep="/")),collapse="\n")) } );
}
if(length(stdoutnames) == 0) {
## Return empty data frame if no files found
return(data.frame());
}
## Make the columns containing the variable values
m <- matrix(unlist(strsplit(stdoutnames, "/")),nrow = length(stdoutnames),byrow=T);
mm <- as.data.frame(m[,c(F,T)]);
## Append the stdout and stderr column
mmm <- cbind(mm,unlist(stdoutcontents),unlist(stderrcontents));
colnames(mmm) <- c(strsplit(stdoutnames[1],"/")[[1]][c(T,F)],"stderr");
## Example:
## parallel --results my/res/dir --header : 'echo {};seq {myvar1}' ::: myvar1 1 2 ::: myvar2 A B
## > load_parallel_results("my/res/dir")
## myvar1 myvar2 stdout stderr
## [1,] "1" "A" "1 A\n1" ""
## [2,] "1" "B" "1 B\n1" ""
## [3,] "2" "A" "2 A\n1\n2" ""
## [4,] "2" "B" "2 B\n1\n2" ""
return(mmm);
}
Background
GNU Parallel has a --results option that stores output in a structured way. If there are 1000000 outputfiles it may be hard to manage them. R is good for that, but it would be awfully slow if you had to read all 1000000 files just to get the ones where argument 1 = "Foo" and argument 2 = "Bar".
Unfortunately I don't think you can save a function in a data.frame column.
But you could store the deparsed text of the function and evaluate it when needed:
e.g.
myFunc <- function(x) { print(x) }
# convert the function to text
funcAsText <- deparse(myFunc)
# convert the text back to a function
newMyFunc <- eval(parse(text=funcAsText))
# now you can use the function newMyFunc exactly like myFunc
newMyFunc("foo")
> [1] "foo"
EDIT:
Since the files are a lot, I suggest you to simply store a string indicating the type of the file and create a function that understands the types and reads the file accordingly; so you can call it when needed by passing the type and filepath.
(Without reading the question body:)
You can store functions in a data.frame like this:
df <- data.frame(fun = 1:3)
df$fun <- c(mean, sd, function(x) x^2)
I am not sure if this will break other things, so consider using tibble or data.table from the same named packages which really support arbitrary object types.
You can use 2D lists to store your functions. Obviously, you lose some of the checks you get with DFs, but that's the whole point here:
> funs <- c(replicate(5, function(x) NULL), replicate(5, function(y) TRUE))
> names <- as.list(letters[1:10])
> # df doesn't work
> df <- data.frame(names=names)
> df.2 <- cbind(df, funs)
Error in as.data.frame.default(x[[i]], optional = TRUE) :
cannot coerce class ""function"" to a data.frame
# but 2d lists do
> lst.2d <- cbind(funs, names)
> lst.2d[2, 1]
$funs
function (x)
NULL
> lst.2d[6, 1]
$funs
function (y)
TRUE

R: promise already under evaluation

I understand that you are probably sick and tired of answering the same question again, but I am still getting the error discussed in several other questions:
promise already under evaluation: recursive default argument reference or earlier problems?
even though I did follow the "cumbersome" advice of prepending ".":
show.large.objects.threshold <- 100000
show.large.objects.exclude <- c("closure")
show.large.objects <- function (.envir = sys.frame(),
threshold = show.large.objects.threshold,
exclude = show.large.objects.exclude) {
for (n in print(ls(.envir, all.names = TRUE))) tryCatch({
o <- get(n,envir = .envir)
s <- object.size(o)
if (s > threshold && !(typeof(o) %in% exclude)) {
cat(n,": ")
print(s,units="auto")
}
}, error = function(e) { cat("n=",n,"\n"); print(e) })
}
show.large.objects.stack <- function (.threshold = show.large.objects.threshold,
skip.levels = 1,# do not examine the last level - this function
.exclude = show.large.objects.exclude) {
for (level in 1:(sys.nframe()-skip.levels)) {
cat("*** show.large.objects.stack(",level,") ")
print(sys.call(level))
show.large.objects(.envir = sys.frame(level), threshold = .threshold, exclude = .exclude)
}
}
but I still get errors:
> f <- function () { c <- 1:1e7; d <- 1:1e6; print(system.time(show.large.objects.stack())) }
> f()
*** show.large.objects.stack( 1 ) f()
[1] "c" "d"
c : 38.1 Mb
d : 3.8 Mb
*** show.large.objects.stack( 2 ) print(system.time(show.large.objects.stack()))
[1] "..." "x"
n= ...
<simpleError in get(n, envir = .envir): argument "..." is missing, with no default>
n= x
<simpleError in get(n, envir = .envir): promise already under evaluation: recursive default argument reference or earlier problems?>
*** show.large.objects.stack( 3 ) system.time(show.large.objects.stack())
[1] "expr" "gcFirst" "ppt" "time"
n= expr
<simpleError in get(n, envir = .envir): promise already under evaluation: recursive default argument reference or earlier problems?>
user system elapsed
0 (0.00ms) 0 (0.00ms) 0.002 (2.00ms)
So, what am I still doing wrong?
Do I really need the . in .envir? What about .exclude and .threshold?
Why do I get the argument "..." is missing, with no default error?
Why do I get the promise already under evaluation error?
Thanks!
When f is called a stack of 5 levels is built down to show.large.objects, which starts to evaluate the contents of the frames starting from the top.
f
-> print
-> system.time
-> show.large.objects.stack
-> show.large.objects
Level 1
f()
Everything ok here.
Level 2
print(system.time(show.large.objects.stack()))
When you call ls(.envir, all.names) on its frame you get
[1] "..." "x"
of which ... is missing and throws error 3 when you call get on it, and x = system.time(show.large.objects.stack()) is currently being evaluated and throws error 4.
Level 3
system.time(show.large.objects.stack())
whose ls gives you
[1] "expr" "gcFirst" "ppt" "time"
of which expr = show.large.objects.stack() is still currently being evaluated and throws another of error 4.
Level 4
show.large.objects.stack()
whose ls contain no sketchy things and completes without errors.
Bottom line
show.large.frames() must be evalutad on its own, not as an argument to any function, or it will throw errors. Why not letting it do the printing itself?
I found this very helpful
> debug(show.large.objects)
> f()
Browse[2]> lapply(sys.frames(), ls)
[[1]]
[1] "c" "d"
[[2]]
[1] "x"
[[3]]
[1] "expr" "gcFirst" "ppt" "time"
[[4]]
[1] "level" "skip.levels"
[[5]]
[1] "exclude" "threshold"

Resources