Multiple inheritance for R6 classes - r

Actual question
What are my options to work around the fact that R6 does not support multiple inheritance?
Disclaimer
I know that R is primarily a functional language. However, it does also have very powerful object-orientation built in. Plus: I don't see what's wrong with mimicking OOD principles/behavior when you
know you're prototyping for an object-oriented language such as C#, Java, etc.
your prototypes of apps need to be self-sufficient ("full stack" including DB-backends, business logic and frontends/UI)
you have such great "prototyping technology" like R6 and shiny at your disposal
Context
My R prototypes for web apps need to be both "full stack"/self sufficient and as close as possible to design patterns/principles and dependency injection containers (proof of concept of simple DI in R) used in our production language (C#/.NET).
In that regard, I came to like the use of interfaces (or abstract classes) very much in order to decouple code modules and to comply with the D (dependency inversion principle) of the SOLID principles of OOD (detailed explanation by "Uncle Bob").
Even though R6 does not explicitly support interfaces, I can nevertheless mimic them with R6 classes that define nothing but "abstract methods" (see example below). This helps me a lot with communicating my software designs to our OO programmers who aren't very familiar with R. I strive for as little "conceptual conversion effort" on their part as possible.
However, this uses up my one value for inherit in R6Class, which becomes a problem when I also want to inherit from a concrete class (as opposed to an "abstract-like" mimicked interface class), because that would mean supplying not one but two classes to inherit.
Example
Before inversion of dependency:
Foo depends on concrete class Bar. From an OOD principles' view, this is pretty bad as it leads to code being tightly coupled.
Bar <- R6Class("Bar",
public = list(doSomething = function(n) private$x[1:n]),
private = list(x = letters)
)
Foo <- R6Class("Foo",
public = list(bar = Bar$new())
)
inst <- Foo$new()
> class(inst)
[1] "Foo" "R6"
> class(inst$bar)
[1] "Bar" "R6"
After inversion of dependency:
Foo and Bar are decoupled now. Both depend on an interface which is mimicked by class IBar. I can decide which implementation of that interface I would like to plug in to instances of Foo at runtime (realized via Property Injection: field bar of Foo)
IBar <- R6Class("IBar",
public = list(doSomething = function(n = 1) stop("I'm the interface method"))
)
Bar <- R6Class("Bar", inherit = IBar,
public = list(doSomething = function(n = 1) private$x[1:n]),
private = list(x = letters)
)
Baz <- R6Class("Baz", inherit = IBar,
public = list(doSomething = function(n = 1) private$x[1:n]),
private = list(x = 1:24)
)
Foo <- R6Class("Foo",
public = list(bar = IBar$new())
)
inst <- Foo$new()
inst$bar <- Bar$new()
> class(inst$bar)
[1] "Bar" "IBar" "R6"
> inst$bar$doSomething(5)
[1] "a" "b" "c" "d" "e"
inst$bar <- Baz$new()
> class(inst$bar)
[1] "Baz" "IBar" "R6"
> inst$bar$doSomething(5)
[1] 1 2 3 4 5
A bit more on why this makes sense with regard to OOD: Foo should be completely agnostic of the way the object stored in field bar is implemented. All it needs to know is which methods it can call on that object. And in order to know that, it's enough to know the interface that the object in field bar implements (IBar with method doSomething(), in our case).
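A minimal sketch of how Foo could enforce that contract at injection time. The setter setBar() is a hypothetical addition (it is not part of the example above); it only uses base R's inherits() to check that the injected object claims to implement IBar:

```r
library(R6)

# Mimicked interface: the method only signals that it must be overridden
IBar <- R6Class("IBar",
  public = list(doSomething = function(n = 1) stop("I'm the interface method"))
)

Bar <- R6Class("Bar", inherit = IBar,
  public = list(doSomething = function(n = 1) private$x[1:n]),
  private = list(x = letters)
)

Foo <- R6Class("Foo",
  public = list(
    bar = NULL,
    # Hypothetical setter: accept only objects whose class vector includes "IBar"
    setBar = function(value) {
      stopifnot(inherits(value, "IBar"))
      self$bar <- value
      invisible(self)
    }
  )
)

inst <- Foo$new()
inst$setBar(Bar$new())
result <- inst$bar$doSomething(3)

# Anything that does not inherit from IBar is rejected
rejected <- tryCatch(inst$setBar(list()), error = function(e) TRUE)
```

Foo never looks at how Bar stores its data; it relies only on the class vector and on doSomething() being callable.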
Using inheritance from base classes to simplify design:
So far, so good. However, I'd also like to simplify my design by defining certain concrete base classes that some of my other concrete classes can inherit from.
BaseClass <- R6Class("BaseClass",
public = list(doSomething = function(n = 1) private$x[1:n])
)
Bar <- R6Class("Bar", inherit = BaseClass,
private = list(x = letters)
)
Baz <- R6Class("Baz", inherit = BaseClass,
private = list(x = 1:24)
)
inst <- Foo$new()
inst$bar <- Bar$new()
> class(inst$bar)
[1] "Bar" "BaseClass" "R6"
> inst$bar$doSomething(5)
[1] "a" "b" "c" "d" "e"
inst$bar <- Baz$new()
> class(inst$bar)
[1] "Baz" "BaseClass" "R6"
> inst$bar$doSomething(5)
[1] 1 2 3 4 5
Combining "interface implementation" and base class inheritance:
This is where I would need multiple inheritance so something like this would work (PSEUDO CODE):
IBar <- R6Class("IBar",
public = list(doSomething = function() stop("I'm the interface method"))
)
BaseClass <- R6Class("BaseClass",
public = list(doSomething = function(n = 1) private$x[1:n])
)
Bar <- R6Class("Bar", inherit = c(IBar, BaseClass),
private = list(x = letters)
)
inst <- Foo$new()
inst$bar <- Bar$new()
class(inst$bar)
[1] "Bar" "BaseClass" "IBar" "R6"
Currently, my value for inherit is already being used up "just" for mimicking an interface implementation and so I lose the "actual" benefits of inheritance for my actual concrete classes.
Alternative thought:
Alternatively, it would be great to explicitly support a differentiation between interface and concrete classes somehow. For example something like this
Bar <- R6Class("Bar", implement = IBar, inherit = BaseClass,
private = list(x = letters)
)

For those interested:
I gave it a second thought and realized that it's not really multiple inheritance per se that I want/need, but rather some better way of mimicking the use of interfaces/abstract classes without giving up inherit for that.
So I tried tweaking R6 a bit so it would allow me to distinguish between inherit and implement in a call to R6Class.
Probably tons of reasons why this is a bad idea, but for now, it gets the job done ;-)
You can install the tweaked version from my forked branch.
Example
devtools::install_github("rappster/R6", ref = "feat_interface")
library(R6)
Correct implementation of interface and "standard inheritance":
IFoo <- R6Class("IFoo",
public = list(foo = function() stop("I'm the interface method"))
)
BaseClass <- R6Class("BaseClass",
public = list(foo = function(n = 1) private$x[1:n])
)
Foo <- R6Class("Foo", implement = IFoo, inherit = BaseClass,
private = list(x = letters)
)
> Foo$new()
<Foo>
Implements interface: <IFoo>
Inherits from: <BaseClass>
Public:
clone: function (deep = FALSE)
foo: function (n = 1)
Private:
x: a b c d e f g h i j k l m n o p q r s t u v w x y z
When an interface is not implemented correctly (i.e. method not implemented):
Bar <- R6Class("Bar", implement = IFoo,
private = list(x = letters)
)
> Bar$new()
Error in Bar$new() :
Non-implemented interface method: foo
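A similar check can be approximated with stock R6, without the implement argument from the fork. The sketch below is an assumption-laden illustration, not the fork's implementation: check_implements() is a hypothetical helper that reads the interface generator's public_methods list and verifies that an instance exposes a function under each name.

```r
library(R6)

IFoo <- R6Class("IFoo",
  public = list(foo = function() stop("I'm the interface method"))
)
BaseClass <- R6Class("BaseClass",
  public = list(foo = function(n = 1) private$x[1:n])
)
# Foo gets foo() from BaseClass; Bar has no foo() at all
Foo <- R6Class("Foo", inherit = BaseClass,
  private = list(x = letters)
)
Bar <- R6Class("Bar",
  private = list(x = letters)
)

# Hypothetical helper: stop if `obj` lacks a public method named in `interface`
check_implements <- function(obj, interface) {
  # clone() is added by R6 itself, so it is not part of the contract
  required <- setdiff(names(interface$public_methods), "clone")
  has_method <- vapply(required, function(m) is.function(obj[[m]]), logical(1))
  if (!all(has_method)) {
    stop("Non-implemented interface method: ",
         paste(required[!has_method], collapse = ", "))
  }
  invisible(TRUE)
}

foo_ok <- check_implements(Foo$new(), IFoo)
bar_msg <- tryCatch(check_implements(Bar$new(), IFoo),
                    error = function(e) conditionMessage(e))
```

This only compares method names; it does not compare argument lists, and it has to be called explicitly (for example at the top of initialize) rather than being enforced by R6Class itself.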
Proof of concept for dependency injection
This is a little draft that elaborates a bit on the motivation and possible implementation approaches for interfaces and inversion of dependency in R6.

Plus: I don't see what's wrong with mimicking OOD principles/behavior when you know you're prototyping for an object-oriented language such as C#, Java, etc.
What’s wrong with it is that you needed to ask this question because R is simply an inadequate tool to prototype an OOD system, because it doesn’t support what you need.
Or just prototype those aspects of your solution which rely on data analysis, and don’t prototype those aspects of the API which don’t fit into the paradigm.
That said, the strength of R is that you can write your own object system; after all, that’s what R6 is. R6 just so happens to be inadequate for your purposes, but nothing stops you from implementing your own system. In particular, S3 already allows multiple inheritance, it just doesn’t support codified interfaces (instead, they happen ad-hoc).
But nothing stops you from providing a wrapper function that performs this codification. For instance, you could implement a set of functions interface and class (beware name clashes though) that can be used as follows:
interface(Printable,
print = prototype(x, ...))
interface(Comparable,
compare_to = prototype(x, y))
class(Foo,
implements = c(Printable, Comparable),
private = list(x = 1),
print = function (x, ...) base::print(x$x, ...),
compare_to = function (x, y) sign(x$x - y$x))
This would then generate (for instance):
print.Foo = function (x, ...) base::print(x$x, ...)
compare_to = function (x, y) UseMethod('compare_to')
compare_to.Foo = function (x, y) sign(x$x - y$x)
Foo = function ()
structure(list(x = 1), class = c('Foo', 'Printable', 'Comparable'))
… and so on. In fact, S4 does something similar (but badly, in my opinion).
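To make the generated code above concrete, here is a hand-written version of what such a wrapper might produce, in plain S3. The interface() and class() wrappers themselves are not implemented here; the fallback method compare_to.Comparable is an added assumption showing one way an "unimplemented interface method" could be reported.

```r
# Generic belonging to the hypothetical Comparable "interface"
compare_to <- function(x, y) UseMethod("compare_to")

# Fallback: reached only if a class lists Comparable but defines no method
compare_to.Comparable <- function(x, y) {
  stop("compare_to() is not implemented for class ", class(x)[1])
}

# Constructor: the class vector names the concrete class first,
# followed by every "interface" it claims to implement
Foo <- function(x = 1) {
  structure(list(x = x), class = c("Foo", "Printable", "Comparable"))
}

print.Foo <- function(x, ...) {
  base::print(x$x, ...)
  invisible(x)
}
compare_to.Foo <- function(x, y) sign(x$x - y$x)

a <- Foo(1)
b <- Foo(3)
cmp <- compare_to(a, b)
is_comparable <- inherits(a, "Comparable")
```

Because S3 dispatch walks the class vector from left to right, compare_to.Foo is found before the Comparable fallback, which is what gives the multiple-inheritance-like behavior.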

Related

R: Overloading primitive, non-generic functions

I'd like to find the minimal hack to be able to say module::obj when module is not a package but a list or environment.
After some digging, I see the following works for the new use case, but breaks the native one:
module = structure(list(f = \(x) x + 1), class = "module_cls")
`::` = function(mod, key) UseMethod("::")
`::.default` = function(mod, key) .Primitive("::")
`::.module_cls` = function(mod, key) mod[[as.character(substitute(key))]]
module::f(1) # works!
base::sum(1, 1) # Error in base::sum : object 'base' not found
The problem seems to be either in the definition of the default method
or how anything that is not module_cls is dispatched to default.
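One likely cause is that UseMethod() has to evaluate its first argument in order to dispatch, so base is looked up as a variable and not found. A possible workaround, sketched under that assumption, avoids S3 dispatch entirely: capture both arguments unevaluated, check whether the left-hand name refers to a list or environment in the caller's scope, and otherwise fall back to getExportedValue(), which is what base R's own `::` uses. Note that a list or environment that happens to share a package's name would shadow that package.

```r
module <- list(f = function(x) x + 1)

`::` <- function(pkg, name) {
  # Capture both sides as strings without evaluating them
  pkg_name <- as.character(substitute(pkg))
  obj_name <- as.character(substitute(name))

  # Is there a list or environment with that name visible to the caller?
  mod <- get0(pkg_name, envir = parent.frame(), inherits = TRUE)
  if (is.list(mod) || is.environment(mod)) {
    return(mod[[obj_name]])
  }

  # Otherwise behave like the usual namespace lookup
  getExportedValue(pkg_name, obj_name)
}

from_module <- module::f(1)
from_base <- base::sum(1, 1)
```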

Initialize R6 class with an instance of Class and return the same Class

Given an R6 class Class1 with its initialize function, I would like to be able to pass an instance of the same class and return it directly.
Something like this:
if("Class1" %in% class(x)) x
else Class1$new(x)
But inside the initialize function of an R6 class, that should work like this
# This script does not work
# Definition of the class
Class1 = R6::R6Class("Class1",
public=list(
initialize = function(x){
# This line does not work
if("Class1"%in%class(x)) return(x)
}
))
# Initiate an instance from scratch
foo = Class1$new()
# Initiate an instance through another instance
bar = Class1$new(foo)
# The two should be the same
identical(foo, bar)
#> TRUE
With the current state of R6 this does not seem to be possible. See the source at github.com/r-lib/R6/blob/master/R/new.R . The most important lines are 154, where initialize is applied, and 194, where public_bind_env is returned. The problem is that, even with super assignment, I don't think we could overwrite the result, because everything is built here from a new empty environment with its own address.
A wrapper function is a widely used solution, and it does what is needed:
class1 <- function(x = NULL) {
if(inherits(x, "Class1")) x else Class1$new(x)
}
Class1 = R6::R6Class("Class1",
public=list(
initialize = function(x = NULL){
}
))
# Initiate an instance from scratch
foo = class1()
# Initiate an instance through another instance
bar = class1(foo)
# The two should be the same
identical(foo, bar)
#> TRUE

How to use `foreach` and `%dopar%` with an `R6` class in R?

I ran into an issue trying to use %dopar% and foreach() together with an R6 class. Searching around, I could only find two resources related to this, an unanswered SO question and an open GitHub issue on the R6 repository.
In one comment on the GitHub issue, a workaround is suggested: reassign the parent_env of the class via SomeClass$parent_env <- environment(). I would like to understand what exactly environment() refers to when this expression is called within the %dopar% of foreach.
Here is a minimal reproducible example:
Work <- R6::R6Class("Work",
public = list(
values = NULL,
initialize = function() {
self$values <- "some values"
}
)
)
Now, the following Task class uses the Work class in the constructor.
Task <- R6::R6Class("Task",
private = list(
..work = NULL
),
public = list(
initialize = function(time) {
private$..work <- Work$new()
Sys.sleep(time)
}
),
active = list(
work = function() {
return(private$..work)
}
)
)
In the Factory class, the Task class is created and the foreach is implemented in ..m.thread().
Factory <- R6::R6Class("Factory",
private = list(
..warehouse = list(),
..amount = NULL,
..parallel = NULL,
..m.thread = function(object, ...) {
cluster <- parallel::makeCluster(parallel::detectCores() - 1)
doParallel::registerDoParallel(cluster)
private$..warehouse <- foreach::foreach(1:private$..amount, .export = c("Work")) %dopar% {
# What exactly does `environment()` encapsulate in this context?
object$parent_env <- environment()
object$new(...)
}
parallel::stopCluster(cluster)
},
..s.thread = function(object, ...) {
for (i in 1:private$..amount) {
private$..warehouse[[i]] <- object$new(...)
}
},
..run = function(object, ...) {
if(private$..parallel) {
private$..m.thread(object, ...)
} else {
private$..s.thread(object, ...)
}
}
),
public = list(
initialize = function(object, ..., amount = 10, parallel = FALSE) {
private$..amount = amount
private$..parallel = parallel
private$..run(object, ...)
}
),
active = list(
warehouse = function() {
return(private$..warehouse)
}
)
)
Then, it is called as:
library(foreach)
x = Factory$new(Task, time = 2, amount = 10, parallel = TRUE)
Without the following line object$parent_env <- environment(), it throws an error (i.e., as mentioned in the other two links): Error in { : task 1 failed - "object 'Work' not found".
I would like to know, (1) what are some potential pitfalls when assigning the parent_env inside foreach and (2) why does it work in the first place?
Update 1:
I returned environment() from within foreach(), so that private$..warehouse captures those environments.
Using rlang::env_print() in a debug session (the browser() statement was placed right after foreach had finished executing), here is what they consist of:
Browse[1]> env_print(private$..warehouse[[1]])
# <environment: 000000001A8332F0>
# parent: <environment: global>
# bindings:
# * Work: <S3: R6ClassGenerator>
# * ...: <...>
Browse[1]> env_print(environment())
# <environment: 000000001AC0F890>
# parent: <environment: 000000001AC20AF0>
# bindings:
# * private: <env>
# * cluster: <S3: SOCKcluster>
# * ...: <...>
Browse[1]> env_print(parent.env(environment()))
# <environment: 000000001AC20AF0>
# parent: <environment: global>
# bindings:
# * private: <env>
# * self: <S3: Factory>
Browse[1]> env_print(parent.env(parent.env(environment())))
# <environment: global>
# parent: <environment: package:rlang>
# bindings:
# * Work: <S3: R6ClassGenerator>
# * .Random.seed: <int>
# * Factory: <S3: R6ClassGenerator>
# * Task: <S3: R6ClassGenerator>
Disclaimer: a lot of what I say here are educated guesses and inferences based on what I know,
I can't guarantee everything is 100% correct.
I think there can be many pitfalls,
and which one applies really depends on what you do.
I think your second question is more important,
because if you understand that,
you'll be able to evaluate some of the pitfalls by yourself.
The topic is rather complex,
but you can probably start by reading about R's lexical scoping.
In essence, R has a sort of hierarchy of environments,
and when R code is executed,
variables whose values are not found in the current environment
(which is what environment() returns)
are sought in the parent environments
(not to be confused with the caller environments).
Based on the GitHub issue you linked,
R6 generators save a "reference" to their parent environments,
and they expect that everything their classes may need can be found in said parent or somewhere along the environment hierarchy,
starting at that parent and going "up".
The reason the workaround you're using works is because you're replacing the generator's parent environment with the one in the current foreach call inside the parallel worker
(which may be a different R process, not necessarily a different thread),
and, given your .export specification probably exports necessary values,
R's lexical scoping can then search for missing values starting from the foreach call in the separate thread/process.
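The lookup rule described above can be reproduced in base R without any parallel backend. This is only an illustration of lexical scoping, not of R6 internals: the function f and the environments are made up for the example, with environment(f) <- ... playing the role that reassigning parent_env plays for the generator.

```r
# `Work` is a free variable: it is looked up in f's enclosing environment
f <- function() Work

# Enclosing environment that knows nothing about `Work`
no_work <- new.env(parent = baseenv())
environment(f) <- no_work
before <- tryCatch(f(), error = function(e) conditionMessage(e))

# Swap in an enclosing environment where `Work` exists
with_work <- new.env(parent = baseenv())
assign("Work", "Work generator", envir = with_work)
environment(f) <- with_work
after <- f()

# A variable in the caller's environment is not consulted
caller <- function() {
  Work <- "caller's Work"
  f()
}
from_caller <- caller()
```

Here before holds a "not found" error message, while after and from_caller both hold the value from with_work, which shows that the enclosing environment, not the caller, decides where free variables are resolved.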
For the specific example you linked,
I found that a simpler way to make it work
(at least on my Linux machine)
is to do the following:
library(doParallel)
cluster <- parallel::makeCluster(parallel::detectCores() - 1)
doParallel::registerDoParallel(cluster)
parallel::clusterExport(cluster, setdiff(ls(), "cluster"))
x = Factory$new(Task, time = 1, amount = 3)
but leaving the ..m.thread function as:
..m.thread = function(object, amount, ...) {
private$..warehouse <- foreach::foreach(1:amount) %dopar% {
object$new(...)
}
}
(and manually call stopCluster when done).
The clusterExport call should have semantics similar to*:
take everything from the main R process' global environment except cluster,
and make it available in each parallel worker's global environment.
That way, any code inside the foreach call can use the generators when lexical scoping reaches their respective global environments.
foreach can be clever and exports some variables automatically
(as shown in the GitHub issue),
but it has limitations,
and the hierarchy used during lexical scoping can get very messy.
*I say "similar to" because I don't know what exactly R does to distinguish (global) environments if forks are used,
but since that export is needed,
I assume they are indeed independent of each other.
PS: I'd use a call to on.exit(parallel::stopCluster(cluster)) if you create workers inside a function call,
that way you avoid leaving processes around until they are somehow stopped if an error occurs.
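A small sketch of that on.exit() pattern, using only the parallel package that ships with R. The function name run_squares and the default of two workers are arbitrary choices for the example.

```r
run_squares <- function(values, n_workers = 2) {
  cluster <- parallel::makeCluster(n_workers)
  # Registered immediately, so the workers are stopped even if the
  # code below fails
  on.exit(parallel::stopCluster(cluster), add = TRUE)

  parallel::parLapply(cluster, values, function(i) i^2)
}

squares <- unlist(run_squares(1:4))
```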

Using oop in R, unable to understand the concept

It's my first attempt at using OOP in R, and I find it difficult to understand the main concepts. For example, what are these:
slot, setGeneric, representation
I was unable to find anything helpful by searching the internet. How do these work in R? For example, I have the following MATLAB class:
classdef windTurbine < handle
properties
NumOfBlades
blade#blade
sweptArea
end
methods
function obj = windTurbine(NumOfBlades,blade)
obj.NumOfBlades = NumOfBlades;
obj.blade = blade;
obj.sweptArea = CalcSweptArea(obj);
end
sweptArea = CalcSweptArea(obj)
end
end
How do I write this in R? How do I add calculations to the constructor? How do I make functions private? And, mainly, how do I use the concepts of OOP in R? An example would be helpful, or a good tutorial.
In addition to http://adv-r.had.co.nz/OO-essentials.html, which presents the object systems found in base R and the recommended packages, there is also R6, which is much closer to what you are doing in MATLAB. Your example translates like this:
# Need to install R6 first:
# install.packages("R6")
library(R6)
windTurbine <- R6Class("windTurbine",
public = list(
# Properties (fields)
NumOfBlades = integer(0),
blade = NULL, # Which kind of object is it?
sweptArea = numeric(0),
# Methods
initialize = function(NumOfBlades, blade) {
self$NumOfBlades <- as.integer(NumOfBlades)
self$blade <- blade
self$sweptArea <- self$CalcSweptArea()
},
CalcSweptArea = function() {
# < your code here>
# (Return a fake value, just for testing)
return(10)
}
))
wt <- windTurbine$new(NumOfBlades = 6, blade = 3)
wt$sweptArea
See ?R6Class. There is also a private argument for private fields and methods.
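As a rough illustration of the private argument, here is a variant of the class with a private field and a private method. The bladeLength parameter and the pi * r^2 formula are assumptions made for the example; they do not come from the MATLAB code in the question.

```r
library(R6)

WindTurbine <- R6Class("WindTurbine",
  public = list(
    sweptArea = NULL,
    initialize = function(bladeLength) {
      private$bladeLength <- bladeLength
      # Private methods are reached through `private$`, not `self$`
      self$sweptArea <- private$calcSweptArea()
    }
  ),
  private = list(
    bladeLength = NULL,
    # Assumed formula: area of the circle swept by one blade
    calcSweptArea = function() pi * private$bladeLength^2
  )
)

wt <- WindTurbine$new(bladeLength = 2)
area <- wt$sweptArea
# Private members are not visible on the object from outside
hidden <- is.null(wt$calcSweptArea)
```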

Order of methods in R reference class and multiple files

There is one thing I really don't like about R reference classes: the order in which you write the methods matters. Suppose your class goes like this:
myclass = setRefClass("myclass",
fields = list(
x = "numeric",
y = "numeric"
))
myclass$methods(
afunc = function(i) {
message("In afunc, I just call bfunc...")
bfunc(i)
}
)
myclass$methods(
bfunc = function(i) {
message("In bfunc, I just call cfunc...")
cfunc(i)
}
)
myclass$methods(
cfunc = function(i) {
message("In cfunc, I print out the sum of i, x and y...")
message(paste("i + x + y = ", i+x+y))
}
)
myclass$methods(
initialize = function(x, y) {
x <<- x
y <<- y
}
)
And then you start an instance, and call a method:
x = myclass(5, 6)
x$afunc(1)
You will get an error:
Error in x$afunc(1) : could not find function "bfunc"
I am interested in two things:
Is there a way to work around this nuisance?
Does this mean I can never split a really long class file into multiple files? (e.g. one file for each method.)
Calling bfunc(i) isn't going to invoke the method since it doesn't know what object it is operating on!
In your method definitions, .self is the object the method was called on. So change your code to:
myclass$methods(
afunc = function(i) {
message("In afunc, I just call bfunc...")
.self$bfunc(i)
}
)
(and similarly for bfunc). Are you coming from C++ or some language where functions within methods are automatically invoked within the object's context?
Some languages make this more explicit, for example in Python a method with one argument like yours actually has two arguments when defined, and would be:
def afunc(self, i):
[code]
but called like:
x.afunc(1)
then within afunc there is the self variable, which refers to x (although calling it self is a universal convention, it could be called anything).
In R, the .self is a little bit of magic sprinkled over reference classes. I don't think you could change it to .this even if you wanted.
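For completeness, a condensed sketch of the whole class with the .self$ change applied to every call, keeping the original order of separate methods() calls. The class is renamed MyClass and uses the default initializer with named arguments, both choices made only to keep the example short.

```r
library(methods)

MyClass <- setRefClass("MyClass",
  fields = list(x = "numeric", y = "numeric")
)

# Methods added one call at a time, callers before callees
MyClass$methods(
  afunc = function(i) .self$bfunc(i)
)
MyClass$methods(
  bfunc = function(i) .self$cfunc(i)
)
MyClass$methods(
  cfunc = function(i) i + x + y
)

obj <- MyClass$new(x = 5, y = 6)
total <- obj$afunc(1)
```

Since each call goes through .self$, the method is resolved when it is invoked rather than when afunc is defined, so the definitions could also live in separate files that are sourced in any order before the first object is created.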
