I defined some S4 matrices, whose class is lazyMatrix. If M is such a matrix, I want to define M[2, ] as the second row of M, and M[2] as the second coefficient of M (when enumerating the coefficients column after column).
So I defined these two S4 methods:
setMethod( # to extract a coefficient
"[",
signature("lazyMatrix", i = "numeric"),
function(x, i) {
......
}
)
setMethod( # to extract a row
"[",
signature("lazyMatrix", i = "numeric", j = "missing", drop = "ANY"),
function(x, i, j, drop) {
......
}
)
But both M[2, ] and M[2] return the second row of M. I tried to exchange the order of the two method definitions, that does not change anything.
I found a solution in the source code of the onion package:
setMethod(
"[",
signature("lazyMatrix", i = "numeric", j = "missing", drop = "missing"),
function(x, i, j, ..., drop) {
n_args <- nargs()
if(n_args == 3L) { # M[i, ]
......
} else if(n_args == 2L) { # M[i]
......
} else {
stop("Invalid arguments in subsetting.")
}
}
)
setMethod(
"[",
signature("lazyMatrix", i = "numeric", j = "missing", drop = "ANY"),
function(x, i, j, ..., drop) {
n_args <- nargs()
if(n_args == 4L) { # M[i, ]
......
} else if(n_args == 3L) { # M[i]
......
} else {
stop("Invalid arguments in subsetting.")
}
}
)
Before finding this solution, I tried using nargs() in my attempts, unsuccessfully. The trick is to add the ... argument. But I don't understand how this works. Please leave another answer or a comment if you can explain.
The dispatch issue has nothing to do with the dots, though it is problematic that the formal arguments of your methods do not match those of the generic function:
> getGeneric("[")
standardGeneric for "[" defined from package "base"
function (x, i, j, ..., drop = TRUE)
standardGeneric("[", .Primitive("["))
<bytecode: 0x1403d1a28>
<environment: 0x1403c9d10>
Methods may be defined for arguments: x, i, j, drop
Use showMethods([) for currently available ones.
?setMethod is quite clear about that:
The definition must be a function with the same formal arguments as the generic; however, setMethod() will handle methods that add arguments, if ... is a formal argument to the generic. See the Details section.
Anyway, the real issue is that setMethod interprets your first signature
signature("lazyMatrix", i = "numeric")
as
signature("lazyMatrix", i = "numeric", j = "ANY", drop = "ANY")
and both of the calls M[2] and M[2, ] match your second signature
signature("lazyMatrix", i = "numeric", j = "missing", drop = "ANY")
more closely than the first (because "missing" is more specific than "ANY"). Hence dispatch in both cases favours the second method.
It is still true that the primitive function [ is the trickiest generic function out there, due to all of the special cases handled in C code for traditional matrices (of implicit class matrix). The nargs() approach is correct and also used by package Matrix, which is really the canonical reference for this kind of thing:
> library(Matrix)
> selectMethod("[", signature = c("Matrix", "numeric", "missing", "missing"))
Method Definition:
function (x, i, j, ..., drop = TRUE)
{
Matrix.msg("M[i,m,m] : nargs()=", nargs(), .M.level = 2)
if (nargs() == 2) {
.M.vectorSub(x, i)
}
else {
callGeneric(x, i = i, , drop = TRUE)
}
}
<bytecode: 0x128e94a90>
<environment: namespace:Matrix>
Signatures:
x i j drop
target "Matrix" "numeric" "missing" "missing"
defined "Matrix" "index" "missing" "missing"
Related
I'm able to successfully modify the behaviour of [.data.frame, but fail to do so for [.data.table.
For data.frame:
# Exact same signature as "[.data.frame" :
"[.my.data.frame" <- function (x, i, j,
drop = if (missing(i)) TRUE
else length(cols) == 1) {
if(!missing(j) && j==8 ) {
cat("Oy vey\n")
}
NextMethod()
}
df <- data.frame(a=1,b=2)
class(df) <- c("my.data.frame", class(df))
# Works as expected:
df[1,2] # 2
df[1,8] # Oy Vey NULL
df[1,] # 1 2
However, for (the considerably more complicated) data.table:
# Exact same signature as "[.data.table" :
"[.my.data.table" <- function (x, i, j, by, keyby, with = TRUE, nomatch = getOption("datatable.nomatch"),
mult = "all", roll = FALSE,
rollends = if (roll == "nearest") c(TRUE, TRUE)
else if (roll >= 0) c(FALSE, TRUE) else c(TRUE, FALSE),
which = FALSE, .SDcols, verbose = getOption("datatable.verbose"),
allow.cartesian = getOption("datatable.allow.cartesian"),
drop = NULL, on = NULL) {
if(!missing(j) && j==8 ) {
cat("Oy vey\n")
}
NextMethod()
}
dt <- data.table(a=1,b=2)
class(dt) <- c("my.data.table", class(dt))
dt[1,2] # ERROR: i is not found in calling scope and it is not a column of type logical. When the first argument inside DT[...] is a single symbol, data.table looks for it in calling scope.
I know better than to pass arguments to NextMethod. It looks like I must call [.data.table explicitly, capture and pass the arguments as unevaluated promises - but all my attempts with quote, substitute or match.call have so far failed. Any insight would be appreciated.
I've found a partial solution, posting here in hope someone might improve on it.
"[.my.data.table" <- function (x, ...) {
# Modifications and tests galore - which can be tricky with this signature
class(x) <- class(x)[-1]
ret <- x[...]
class(x) <- c("my.data.table", class(x))
ret
}
I still consider this partial, because actually doing something in the function probably involves at least something like arglist <- list(...), and this fails when [ is called like this -
dt[1,]
Other directions are still very welcome.
I am pretty new to R, I have coded with Python and here OOP is quite different to python. I am trying to understand it, so in S3 you can create methods/functions that are not directly attached to a single class, just the same as the objects as they can be in multiple classes (which is quite flexible I guess). However what I do not understand is when I am creating a class such as:
> my_mean <- function (x, ...) {
UseMethod("my_mean", x)}
> my_mean
function (x, ...) {
UseMethod("my_mean", x)}
> my_mean.default <- function(obj){cat("this is a generic function")}
> my_mean.default
function(obj){cat("this is a generic function")}
But then when I have to run for example summary:
summary.default
function (object, ..., digits, quantile.type = 7)
{
if (is.factor(object))
return(summary.factor(object, ...))
else if (is.matrix(object)) {
if (missing(digits))
return(summary.matrix(object, quantile.type = quantile.type,
...))
else return(summary.matrix(object, digits = digits, quantile.type = quantile.type,
...))
}
value <- if (is.logical(object))
c(Mode = "logical", {
tb <- table(object, exclude = NULL, useNA = "ifany")
if (!is.null(n <- dimnames(tb)[[1L]]) && any(iN <- is.na(n))) dimnames(tb)[[1L]][iN] <- "NA's"
tb
})
else if (is.numeric(object)) {
nas <- is.na(object)
object <- object[!nas]
qq <- stats::quantile(object, names = FALSE, type = quantile.type)
qq <- c(qq[1L:3L], mean(object), qq[4L:5L])
if (!missing(digits))
qq <- signif(qq, digits)
names(qq) <- c("Min.", "1st Qu.", "Median",
"Mean", "3rd Qu.", "Max.")
if (any(nas))
c(qq, `NA's` = sum(nas))
else qq
}
else if (is.recursive(object) && !is.language(object) &&
(n <- length(object))) {
sumry <- array("", c(n, 3L), list(names(object),
c("Length", "Class", "Mode")))
ll <- numeric(n)
for (i in 1L:n) {
ii <- object[[i]]
ll[i] <- length(ii)
cls <- oldClass(ii)
sumry[i, 2L] <- if (length(cls))
cls[1L]
else "-none-"
sumry[i, 3L] <- mode(ii)
}
sumry[, 1L] <- format(as.integer(ll))
sumry
}
else c(Length = length(object), Class = class(object), Mode = mode(object))
class(value) <- c("summaryDefault", "table")
value
}
<bytecode: 0x000001926eaaf8f8>
<environment: namespace:base>
> summary
function (object, ...)
UseMethod("summary")
<bytecode: 0x000001926e9ec2c0>
<environment: namespace:base>
I cannot see the difference in why when you call summary in the console it does not give you the function, it gives you a reference to that object. There's any explanation? Furthermore, is it generic in some way similar to init?
S3 classes work nothing like any OOP you may be familiar with from other languages. They are a losely connected set of mechanisms that only work when you stick to certain rules.
x <- 1:11
mean(x)
#> [1] 6
This implcitely calls the function mean.default because x is a simple atomic vector.
Now we create a method for our own class evil
mean.evil <- function( x ) {
return(666) # always retuns 666 that is why it is evil
}
And we convert the vector x to a class evil:
class(x) <- "evil" # you can actually do it just like that
Now, calling mean determines that xis of class evil and calls the according function.
mean(x) # calls mean.evil
#> [1] 666
mean.default(x) # coerces R to use the default method which is still possible
#> [1] 6
The reason is that mean uses UseMethod() which checks the class and tries to find a function that has name with the pattern mean.[myclass]. And that is all that happens.
mean
#> function (x, ...)
#> UseMethod("mean")
#> <bytecode: 0x0000000015812e18>
#> <environment: namespace:base>
In other languages everything is held together by the syntax. S3 mechanisms on the other hand can be used to "approximate" OOP but they can be easily misused. They are simply and effective and appropriate for many use cases in R. If you are interested in more advanced OOP in R I recommend R6 classes.
Created on 2020-06-30 by the reprex package (v0.3.0)
I'm currently doing Advanced-R, 18 Expressions.
Topic is about 18.5.2 Finding all variables created by assignment, but the given code doesn't work in the case of pairlist.
I followed all the given codes, but the results are not quite same with what I expect.
To begin with, in order to figure out what the type of the input, expr_type() is needed.
expr_type <- function(x) {
if(rlang::is_syntactic_literal(x)) {
"constant"
} else if (is.symbol(x)) {
"symbol"
} else if (is.call(x)) {
"call"
} else if (is.pairlist(x)) {
"pairlist"
} else {
typeof(x)
}
}
And the author, hadley, coupled this with a wrapper around the switch function.
switch_expr <- function(x, ...) {
switch(expr_type(x),
...,
stop("Don't know how to handle type ", typeof(x), call. = FALSE)
)
}
In the case of base cases, symbol and constant, is trivial because neither represents assignment.
find_assign_rec <- function(x) {
switch_expr(x,
constant = ,
symbol = character()
)
}
In the case of recursive cases, especially for pairlists, he suggested
flat_map_chr <- function(.x, .f, ...) {
purrr::flatten_chr(purrr::map(.x, .f, ...))
}
So summing up, it follows
find_assign_rec <- function(x) {
switch_expr(x,
# Base cases
constant = ,
symbol = character(),
# Recursive cases
pairlist = flat_map_chr(as.list(x), find_assign_rec),
)
}
find_assign <- function(x) find_assign_rec(enexpr(x))
Then, I expect in the case of pl <- pairlist(x = 1, y = 2), find_assign(pl) should return #> [1] "x" "y"
But the actual output is character(0)
What is wrong with this?
I would like to redefine the meaning of the subset operator [ to work with (x, y) coordinates instead of the (i, j) matrix index system. I'd like this to work on matrices (behind the scenes).
I thought initially of creating an S4 class based on the matrix type. And then use the generic function [ and create a method for it that would do sub-setting with i and j swapped. But I realized now I can't change the signature of the generic:
> getGeneric("[")
standardGeneric for "[" defined from package "base"
function (x, i, j, ..., drop = TRUE)
standardGeneric("[", .Primitive("["))
<bytecode: 0x55f8eef68ad8>
<environment: 0x55f8eef5f310>
Methods may be defined for arguments: x, i, j, drop
Use showMethods("[") for currently available ones.
So is there a way of doing this? I am thinking of rewriting the definition of [ but that feels heretic... So, some advice is greatly appreciated!
Here's some pseudo R code that reflects my thoughts on how I could redefine [. Do you anticipate trouble ahead if I go forward with this approach?
setClass(
"plane",
slots = c(
type = "character",
xlen = "integer",
ylen = "integer",
boundary = "character",
lattice = "matrix"
)
)
`[` <- function(obj, x, y, ..., drop = TRUE) {
if (is(obj, 'plane')) {
lattice <- base::`[`(x = obj#lattice, i = y, j = x, ..., drop = FALSE)
xlen <- length(x)
ylen <- length(y)
new_plane <- plane(type = obj#type,
xlen = xlen,
ylen = ylen,
boundary = obj#boundary)
new_plane#lattice <- lattice
return(new_plane)
}
else
return(base::`[`(x = obj, i = x, j = y, ..., drop = drop))
}
I would like to write a [. method for my ReferenceClass. So far, I have something like this:
DT <- data.table(Index=1:5)
MySeries <- setRefClass("MySeries", fields = list(data="data.table"))
setMethod("[","MySeries",function(x, i,j,drop) {
ii <- substitute(i)
x$data <- x$data[eval(ii)]
return(x)
})
S <- MySeries(data=DT)
... but it throws an error when I finally call S[Index>3]. How to fix the above to get this expected result?
Index
4: 4
5: 5
This is really about the use of eval(substitute()) as much as about S4 methods. Here is the generic that you are interested in
> getGeneric("[")
standardGeneric for "[" defined from package "base"
function (x, i, j, ..., drop = TRUE)
standardGeneric("[", .Primitive("["))
<bytecode: 0x42f4fe0>
<environment: 0x3214270>
Methods may be defined for arguments: x, i, j, drop
Use showMethods("[") for currently available ones.
Your method signature differs from the generic (no '...' and no default for 'drop') so the method has a nested '.local' function
> getMethod("[", "MySeries")
Method Definition:
function (x, i, j, ..., drop = TRUE)
{
.local <- function (x, i, j, drop)
{
ii <- substitute(i)
x$data <- x$data[eval(ii)]
return(x)
}
.local(x, i, j, ..., drop)
}
Signatures:
x
target "MySeries"
defined "MySeries"
and subsitute(i) is not what you think it is. Instead, write a method matching the generic signature
setMethod("[", "MySeries", function(x, i, j, ..., drop=TRUE) {
x$data <- x$data[eval(substitute(i))]
x
})
nested functions are a general problem with the eval(substitute()) paradigm, not just definition of S4 methods; see this question.