Using OOP in R, unable to understand the concept

This is my first attempt at using OOP in R, and I find it difficult to understand the main concepts. For example, what are these:
slot, setGeneric, representation
I was unable to find anything helpful by searching the internet. How do these work in R? For example, I have the following MATLAB class:
classdef windTurbine < handle
    properties
        NumOfBlades
        blade@blade
        sweptArea
    end
    methods
        function obj = windTurbine(NumOfBlades, blade)
            obj.NumOfBlades = NumOfBlades;
            obj.blade = blade;
            obj.sweptArea = CalcSweptArea(obj);
        end
        sweptArea = CalcSweptArea(obj)
    end
end
How do I write this in R? How do I add calculations to the constructor? How do I make functions private? And mainly, how do I use the concept of OOP in R? An example would be helpful, or a nice tutorial explanation.

In addition to the object systems from base R and the recommended packages, which are presented at http://adv-r.had.co.nz/OO-essentials.html, there is also R6, which is much closer to what you are doing in MATLAB. Your example translates like this:
# Need to install R6 first:
# install.packages("R6")
library(R6)
windTurbine <- R6Class("windTurbine",
  public = list(
    # Properties (fields)
    NumOfBlades = integer(0),
    blade = NULL, # Which kind of object is it?
    sweptArea = numeric(0),
    # Methods
    initialize = function(NumOfBlades, blade) {
      self$NumOfBlades <- as.integer(NumOfBlades)
      self$blade <- blade
      self$sweptArea <- self$CalcSweptArea()
    },
    CalcSweptArea = function() {
      # < your code here>
      # (Return a fake value, just for testing)
      return(10)
    }
  ))
wt <- windTurbine$new(NumOfBlades = 6, blade = 3)
wt$sweptArea
See ?R6Class. There is also a private = argument for private fields and methods.
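As a rough sketch of the private = argument, the variant below moves the area calculation into a private method so that it can only be called from inside the object. The bladeLength field and the pi * r^2 formula are assumptions made purely for illustration; they are not part of the original MATLAB class.

```r
library(R6)

WindTurbine <- R6Class("WindTurbine",
  public = list(
    NumOfBlades = NULL,
    bladeLength = NULL,  # assumed field (not in the original class)
    sweptArea = NULL,
    initialize = function(NumOfBlades, bladeLength) {
      self$NumOfBlades <- as.integer(NumOfBlades)
      self$bladeLength <- bladeLength
      # Calculations in the constructor can call private helpers
      self$sweptArea <- private$calcSweptArea()
    }
  ),
  private = list(
    # Assumed formula: area of the circle swept by one blade length
    calcSweptArea = function() pi * self$bladeLength^2
  )
)

wt <- WindTurbine$new(NumOfBlades = 3, bladeLength = 2)
wt$sweptArea               # computed once, in the constructor
is.null(wt$calcSweptArea)  # private methods are not reachable from outside
```

Inside methods, public members are reached through self$ and private ones through private$.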

Related

R: Overloading primitive, non-generic functions

I'd like to find the minimal hack to be able to say module::obj when module is not a package but a list or environment.
After some digging, I see the following works for the new use case, but breaks the native one:
module = structure(list(f = \(x) x + 1), class = "module_cls")
`::` = function(mod, key) UseMethod("::")
`::.default` = function(mod, key) .Primitive("::")
`::.module_cls` = function(mod, key) mod[[as.character(substitute(key))]]
module::f(1) # works!
base::sum(1, 1) # Error in base::sum : object 'base' not found
The problem seems to be either in the definition of the default method
or in how anything that is not module_cls is dispatched to default.
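One possible direction, offered only as a sketch: UseMethod() has to evaluate its first argument in order to dispatch, and base is not a variable, which would explain the "object 'base' not found" error. The version below avoids dispatch altogether. It looks the left-hand symbol up as a variable first and otherwise falls back to getExportedValue(), which is what the stock `::` calls. Masking `::` in this way is an assumption about what counts as a "minimal hack", not an established idiom.

```r
module <- structure(list(f = function(x) x + 1), class = "module_cls")

`::` <- function(pkg, name) {
  pkg_name <- as.character(substitute(pkg))
  obj_name <- as.character(substitute(name))
  # Is there a variable of that name visible from the caller?
  candidate <- get0(pkg_name, envir = parent.frame(), inherits = TRUE)
  if (inherits(candidate, "module_cls")) {
    candidate[[obj_name]]
  } else {
    # Fall back to the normal package lookup
    getExportedValue(pkg_name, obj_name)
  }
}

module::f(1)
base::sum(1, 1)
```

A variable that happens to share its name with a package is still treated as a package unless it carries the module_cls class.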

How to use `foreach` and `%dopar%` with an `R6` class in R?

I ran into an issue trying to use %dopar% and foreach() together with an R6 class. Searching around, I could only find two resources related to this, an unanswered SO question and an open GitHub issue on the R6 repository.
In one comment (in the GitHub issue), a workaround is suggested: reassigning the parent_env of the class via SomeClass$parent_env <- environment(). I would like to understand what exactly environment() refers to when this expression is called within the %dopar% of foreach.
Here is a minimal reproducible example:
Work <- R6::R6Class("Work",
  public = list(
    values = NULL,
    initialize = function() {
      self$values <- "some values"
    }
  )
)
Now, the following Task class uses the Work class in the constructor.
Task <- R6::R6Class("Task",
  private = list(
    ..work = NULL
  ),
  public = list(
    initialize = function(time) {
      private$..work <- Work$new()
      Sys.sleep(time)
    }
  ),
  active = list(
    work = function() {
      return(private$..work)
    }
  )
)
In the Factory class, the Task class is created and the foreach is implemented in ..m.thread().
Factory <- R6::R6Class("Factory",
  private = list(
    ..warehouse = list(),
    ..amount = NULL,
    ..parallel = NULL,
    ..m.thread = function(object, ...) {
      cluster <- parallel::makeCluster(parallel::detectCores() - 1)
      doParallel::registerDoParallel(cluster)
      private$..warehouse <- foreach::foreach(1:private$..amount, .export = c("Work")) %dopar% {
        # What exactly does `environment()` encapsulate in this context?
        object$parent_env <- environment()
        object$new(...)
      }
      parallel::stopCluster(cluster)
    },
    ..s.thread = function(object, ...) {
      for (i in 1:private$..amount) {
        private$..warehouse[[i]] <- object$new(...)
      }
    },
    ..run = function(object, ...) {
      if (private$..parallel) {
        private$..m.thread(object, ...)
      } else {
        private$..s.thread(object, ...)
      }
    }
  ),
  public = list(
    initialize = function(object, ..., amount = 10, parallel = FALSE) {
      private$..amount <- amount
      private$..parallel <- parallel
      private$..run(object, ...)
    }
  ),
  active = list(
    warehouse = function() {
      return(private$..warehouse)
    }
  )
)
Then, it is called as:
library(foreach)
x = Factory$new(Task, time = 2, amount = 10, parallel = TRUE)
Without the line object$parent_env <- environment(), it throws an error (as mentioned in the two links above): Error in { : task 1 failed - "object 'Work' not found".
I would like to know (1) what potential pitfalls there are when assigning the parent_env inside foreach and (2) why it works in the first place.
Update 1:
I returned environment() from within foreach(), so that private$..warehouse captures those environments.
Using rlang::env_print() in a debug session (the browser() statement was placed right after foreach finished executing), here is what they consist of:
Browse[1]> env_print(private$..warehouse[[1]])
# <environment: 000000001A8332F0>
# parent: <environment: global>
# bindings:
# * Work: <S3: R6ClassGenerator>
# * ...: <...>
Browse[1]> env_print(environment())
# <environment: 000000001AC0F890>
# parent: <environment: 000000001AC20AF0>
# bindings:
# * private: <env>
# * cluster: <S3: SOCKcluster>
# * ...: <...>
Browse[1]> env_print(parent.env(environment()))
# <environment: 000000001AC20AF0>
# parent: <environment: global>
# bindings:
# * private: <env>
# * self: <S3: Factory>
Browse[1]> env_print(parent.env(parent.env(environment())))
# <environment: global>
# parent: <environment: package:rlang>
# bindings:
# * Work: <S3: R6ClassGenerator>
# * .Random.seed: <int>
# * Factory: <S3: R6ClassGenerator>
# * Task: <S3: R6ClassGenerator>
Disclaimer: a lot of what I say here consists of educated guesses and inferences based on what I know; I can't guarantee everything is 100% correct.
I think there can be many pitfalls, and which one applies really depends on what you do. I think your second question is more important, because if you understand that, you'll be able to evaluate some of the pitfalls yourself.
The topic is rather complex, but you can probably start by reading about R's lexical scoping. In essence, R has a sort of hierarchy of environments, and when R code is executed, variables whose values are not found in the current environment (which is what environment() returns) are sought in the parent environments (not to be confused with the caller environments).
Based on the GitHub issue you linked, R6 generators save a "reference" to their parent environments, and they expect that everything their classes may need can be found in said parent or somewhere along the environment hierarchy, starting at that parent and going "up". The workaround you're using works because you're replacing the generator's parent environment with the one in the current foreach call inside the parallel worker (which may be a different R process, not necessarily a different thread), and, given that your .export specification probably exports the necessary values, R's lexical scoping can then search for missing values starting from the foreach call in the separate thread/process.
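The difference between parent (enclosing) and caller environments can be seen without R6 or foreach. The following is a minimal base-R sketch; all names in it (make_reader, read_label, call_reader, label) are made up for illustration. It also shows that swapping a function's environment changes where its free variables are looked up, which is roughly what the parent_env reassignment does for the generator.

```r
make_reader <- function() {
  label <- "enclosing"
  function() label          # `label` is a free variable here
}
read_label <- make_reader()

call_reader <- function() {
  label <- "caller"         # ignored: lookup does not go through callers
  read_label()
}
from_caller <- call_reader()
from_caller                 # value found in the enclosing environment

# Replace the enclosing environment and look the variable up again
replacement <- new.env()
assign("label", "replaced", envir = replacement)
environment(read_label) <- replacement
after <- read_label()
after                       # value now comes from `replacement`
```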
For the specific example you linked, I found that a simpler way to make it work (at least on my Linux machine) is to do the following:
library(doParallel)
cluster <- parallel::makeCluster(parallel::detectCores() - 1)
doParallel::registerDoParallel(cluster)
parallel::clusterExport(cluster, setdiff(ls(), "cluster"))
x = Factory$new(Task, time = 1, amount = 3)
but leaving the ..m.thread function as:
..m.thread = function(object, amount, ...) {
  private$..warehouse <- foreach::foreach(1:amount) %dopar% {
    object$new(...)
  }
}
(and manually call stopCluster when done).
The clusterExport call should have semantics similar to*: take everything from the main R process' global environment except cluster, and make it available in each parallel worker's global environment. That way, any code inside the foreach call can use the generators when lexical scoping reaches their respective global environments. foreach can be clever and export some variables automatically (as shown in the GitHub issue), but it has limitations, and the hierarchy used during lexical scoping can get very messy.
*I say "similar to" because I don't know what exactly R does to distinguish (global) environments if forks are used, but since that export is needed, I assume they are indeed independent of each other.
PS: I'd use a call to on.exit(parallel::stopCluster(cluster)) if you create workers inside a function call; that way you avoid leaving processes around until they are somehow stopped if an error occurs.
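A small sketch of both points, using only the parallel package: the cluster is stopped via on.exit() even if an error occurs, and a variable exported with clusterExport() is found by a worker-side function through the worker's global environment. The names add_offset, offset and run_with_cluster are hypothetical, and two socket workers are assumed to be available on the machine.

```r
library(parallel)

# `offset` is a free variable; on a worker it is resolved in the
# worker's global environment, where clusterExport() places it.
add_offset <- function(v) v^2 + offset

run_with_cluster <- function(values, n_workers = 2) {
  cluster <- makeCluster(n_workers)
  # Runs when the function exits, normally or through an error
  on.exit(stopCluster(cluster), add = TRUE)

  offset <- 1
  # Copy the local `offset` into each worker's global environment
  clusterExport(cluster, "offset", envir = environment())
  parLapply(cluster, values, add_offset)
}

res <- run_with_cluster(1:3)
unlist(res)
```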

Multiple inheritance for R6 classes

Actual question
What are my options for working around the fact that R6 does not support multiple inheritance?
Disclaimer
I know that R is primarily a functional language. However, it does also have very powerful object-orientation built in. Plus: I don't see what's wrong with mimicking OOD principles/behavior when
- you know you're prototyping for an object-oriented language such as C#, Java, etc.
- your prototypes of apps need to be self-sufficient ("full stack" including DB-backends, business logic and frontends/UI)
- you have such great "prototyping technology" like R6 and shiny at your disposal
Context
My R prototypes for web apps need to be both "full stack"/self sufficient and as close as possible to design patterns/principles and dependency injection containers (proof of concept of simple DI in R) used in our production language (C#/.NET).
In that regard, I came to like the use of interfaces (or abstract classes) very much in order to decouple code modules and to comply with the D (dependency inversion principle) of the SOLID principles of OOD (detailed explanation by "Uncle Bob").
Even though R6 does not explicitly support interfaces, I can nevertheless mimic them perfectly with R6 classes that define nothing but "abstract methods" (see example below). This helps me a lot with communicating my software designs to our OO programmers who aren't very familiar with R. I strive for as little "conceptual conversion effort" on their part as possible.
However, I need to give up my value for inherit in R6Class for that, which becomes a bit of a problem when I actually want to inherit from other concrete classes (as opposed to the "abstract-like" mimicked interface classes), because this would mean specifying not one but two classes in inherit.
Example
Before inversion of dependency:
Foo depends on concrete class Bar. From an OOD principles' view, this is pretty bad as it leads to code being tightly coupled.
Bar <- R6Class("Bar",
public = list(doSomething = function(n) private$x[1:n]),
private = list(x = letters)
)
Foo <- R6Class("Foo",
public = list(bar = Bar$new())
)
> inst <- Foo$new()
> class(inst)
[1] "Foo" "R6"
> class(inst$bar)
[1] "Bar" "R6"
After inversion of dependency:
Foo and Bar are decoupled now. Both depend on an interface which is mimicked by class IBar. I can decide which implementation of that interface I would like to plug in to instances of Foo at runtime (realized via Property Injection: field bar of Foo)
IBar <- R6Class("IBar",
  public = list(doSomething = function(n = 1) stop("I'm the interface method"))
)
Bar <- R6Class("Bar", inherit = IBar,
public = list(doSomething = function(n = 1) private$x[1:n]),
private = list(x = letters)
)
Baz <- R6Class("Baz", inherit = IBar,
public = list(doSomething = function(n = 1) private$x[1:n]),
private = list(x = 1:24)
)
Foo <- R6Class("Foo",
public = list(bar = IBar$new())
)
inst <- Foo$new()
inst$bar <- Bar$new()
> class(inst$bar)
[1] "Bar" "IBar" "R6"
> inst$bar$doSomething(5)
[1] "a" "b" "c" "d" "e"
inst$bar <- Baz$new()
> class(inst$bar)
[1] "Baz" "IBar" "R6"
> inst$bar$doSomething(5)
[1] 1 2 3 4 5
A bit more on why this makes sense with regard to OOD: Foo should be completely agnostic of the way the object stored in field bar is implemented. All it needs to know is which methods it can call on that object. And in order to know that, it's enough to know the interface that the object in field bar implements (IBar with method doSomething(), in our case).
Using inheritance from base classes to simplify design:
So far, so good. However, I'd also like to simplify my design by defining certain concrete base classes that some of my other concrete classes can inherit from.
BaseClass <- R6Class("BaseClass",
public = list(doSomething = function(n = 1) private$x[1:n])
)
Bar <- R6Class("Bar", inherit = BaseClass,
private = list(x = letters)
)
Baz <- R6Class("Baz", inherit = BaseClass,
private = list(x = 1:24)
)
inst <- Foo$new()
inst$bar <- Bar$new()
> class(inst$bar)
[1] "Bar" "BaseClass" "R6"
> inst$bar$doSomething(5)
[1] "a" "b" "c" "d" "e"
inst$bar <- Baz$new()
> class(inst$bar)
[1] "Baz" "BaseClass" "R6"
> inst$bar$doSomething(5)
[1] 1 2 3 4 5
Combining "interface implementation" and base class inheritance:
This is where I would need multiple inheritance so something like this would work (PSEUDO CODE):
IBar <- R6Class("IBar",
  public = list(doSomething = function() stop("I'm the interface method"))
)
BaseClass <- R6Class("BaseClass",
public = list(doSomething = function(n = 1) private$x[1:n])
)
Bar <- R6Class("Bar", inherit = c(IBar, BaseClass),
private = list(x = letters)
)
inst <- Foo$new()
inst$bar <- Bar$new()
class(inst$bar)
[1] "Bar" "BaseClass" "IBar" "R6"
Currently, my value for inherit is already being used up "just" for mimicking an interface implementation and so I lose the "actual" benefits of inheritance for my actual concrete classes.
Alternative thought:
Alternatively, it would be great to explicitly support a differentiation between interface and concrete classes somehow. For example something like this
Bar <- R6Class("Bar", implement = IBar, inherit = BaseClass,
private = list(x = letters)
)
For those interested:
I gave it a second thought and realized that it's not really multiple inheritance per se that I want/need, but rather some better way of mimicking the use of interfaces/abstract classes without giving up inherit for that.
So I tried tweaking R6 a bit so it would allow me to distinguish between inherit and implement in a call to R6Class.
Probably tons of reasons why this is a bad idea, but for now, it gets the job done ;-)
You can install the tweaked version from my forked branch.
Example
devtools::install_github("rappster/R6", ref = "feat_interface")
library(R6)
Correct implementation of interface and "standard inheritance":
IFoo <- R6Class("IFoo",
  public = list(foo = function() stop("I'm the interface method"))
)
BaseClass <- R6Class("BaseClass",
public = list(foo = function(n = 1) private$x[1:n])
)
Foo <- R6Class("Foo", implement = IFoo, inherit = BaseClass,
private = list(x = letters)
)
> Foo$new()
<Foo>
Implements interface: <IFoo>
Inherits from: <BaseClass>
Public:
clone: function (deep = FALSE)
foo: function (n = 1)
Private:
x: a b c d e f g h i j k l m n o p q r s t u v w x y z
When an interface is not implemented correctly (i.e. method not implemented):
Bar <- R6Class("Bar", implement = IFoo,
private = list(x = letters)
)
> Bar$new()
Error in Bar$new() :
Non-implemented interface method: foo
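For readers who would rather stay on the CRAN release of R6, a similar check can be approximated at construction time. This is only a sketch: check_implements is a hypothetical helper, not part of R6, and it relies on the generator's public_methods list to find the method names the interface-like class declares. inherit stays free for the concrete base class.

```r
library(R6)

# Hypothetical helper: stop if `obj` lacks any public method that the
# interface-like generator declares.
check_implements <- function(obj, interface) {
  required <- names(interface$public_methods)
  present <- vapply(required, function(m) is.function(obj[[m]]), logical(1))
  if (!all(present)) {
    stop("Non-implemented interface method: ",
         paste(required[!present], collapse = ", "))
  }
  invisible(obj)
}

IFoo <- R6Class("IFoo",
  public = list(foo = function() stop("I'm the interface method"))
)
BaseClass <- R6Class("BaseClass",
  public = list(foo = function(n = 1) private$x[1:n])
)
# Inherits the implementation of foo() and declares conformance to IFoo
Foo <- R6Class("Foo", inherit = BaseClass,
  public = list(initialize = function() check_implements(self, IFoo)),
  private = list(x = letters)
)
# Claims to implement IFoo but provides no foo()
Bar <- R6Class("Bar",
  public = list(initialize = function() check_implements(self, IFoo)),
  private = list(x = letters)
)

Foo$new()$foo(3)
result <- try(Bar$new(), silent = TRUE)  # fails the interface check
```

Unlike the forked implement = argument, this only compares method names. It does not compare signatures, and nothing appears in the printed object.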
Proof of concept for dependency injection
This is a little draft that elaborates a bit on the motivation and possible implementation approaches for interfaces and inversion of dependency in R6.
Plus: I don't see what's wrong with mimicking OOD principles/behavior when you know you're prototyping for an object-oriented language such as C#, Java, etc.
What’s wrong with it is that you needed to ask this question because R is simply an inadequate tool to prototype an OOD system, because it doesn’t support what you need.
Or just prototype those aspects of your solution which rely on data analysis, and don’t prototype those aspects of the API which don’t fit into the paradigm.
That said, the strength of R is that you can write your own object system; after all, that’s what R6 is. R6 just so happens to be inadequate for your purposes, but nothing stops you from implementing your own system. In particular, S3 already allows multiple inheritance, it just doesn’t support codified interfaces (instead, they happen ad-hoc).
But nothing stops you from providing a wrapper function that performs this codification. For instance, you could implement a set of functions interface and class (beware name clashes though) that can be used as follows:
interface(Printable,
print = prototype(x, ...))
interface(Comparable,
compare_to = prototype(x, y))
class(Foo,
implements = c(Printable, Comparable),
private = list(x = 1),
print = function (x, ...) base::print(x$x, ...),
compare_to = function (x, y) sign(x$x - y$x))
This would then generate (for instance):
print.Foo = function (x, ...) base::print(x$x, ...)
compare_to = function (x, y) UseMethod('compare_to')
compare_to.Foo = function (x, y) sign(x$x - y$x)
Foo = function ()
structure(list(x = 1), class = c('Foo', 'Printable', 'Comparable'))
… and so on. In fact, S4 does something similar (but badly, in my opinion).
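To make the idea a little more concrete, here is a much smaller runnable sketch in plain S3. It does not implement the interface()/class() wrappers above. Instead, an "interface" is assumed to be just a character vector of generic names, and check_interface is a hypothetical helper that verifies a method exists for each of them before an object is built.

```r
# Hypothetical helper: does `class_name` have an S3 method for each generic?
# (Only looks for visible functions named generic.class.)
check_interface <- function(class_name, generics) {
  has_method <- vapply(
    generics,
    function(g) exists(paste(g, class_name, sep = "."), mode = "function"),
    logical(1)
  )
  if (!all(has_method)) {
    stop("Class ", class_name, " lacks methods for: ",
         paste(generics[!has_method], collapse = ", "))
  }
  invisible(TRUE)
}

# "Interfaces" as vectors of generic names (an assumption of this sketch)
Printable <- "print"
Comparable <- "compare_to"

compare_to <- function(x, y) UseMethod("compare_to")
compare_to.Foo <- function(x, y) sign(x$x - y$x)
print.Foo <- function(x, ...) {
  cat("<Foo>", x$x, "\n")
  invisible(x)
}

Foo <- function(x = 1) {
  check_interface("Foo", c(Printable, Comparable))
  # S3 allows several classes at once, so the interfaces show up in class()
  structure(list(x = x), class = c("Foo", "Printable", "Comparable"))
}

compare_to(Foo(3), Foo(1))
inherits(Foo(), "Comparable")
```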

How would you index a table that is being initialized?

An example of what I desire:
local X = {["Alpha"] = 5, ["Beta"] = this.Alpha+3}
print(X.Beta) --> error: [string "stdin"]:1: attempt to index global 'this' (a nil value)
Is there a way to get this working, or a substitute I can use without too much code bloat? (I want it to look presentable, so fenv hacks are out of the picture.)
If anyone wants to take a crack at Lua, repl.it is a good testing webpage for quick scripts.
No, there is no way to do this, because the table does not yet exist and there is no notion of "self" in Lua (except via syntactic sugar for table methods). You have to do it in two steps:
local X = {["Alpha"] = 5}
X["Beta"] = X.Alpha+3
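Note that you only need the square brackets (and quotes) if your key is not a string, or if it is a string that is not a valid Lua identifier, i.e. it contains characters other than letters, digits and underscores, starts with a digit, or is a reserved word.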
local X = {Alpha = 5}
X.Beta = X.Alpha+3
Update:
Based on what I saw on your pastebin, you probably should do this slightly differently:
local Alpha = 5
local X = {
Alpha = Alpha,
Beta = Alpha+3,
Gamma = someFunction(Alpha),
Eta = Alpha:method()
}
(obviously Alpha has no method because in the example it is a number but you get the idea, just wanted to show if Alpha were an object).

Modify contents of object with "call by reference"

I am trying to modify the contents of an object defined by a self-written class with a function that takes two objects of this class and adds the contents.
setClass("test",representation(val="numeric"),prototype(val=1))
I know that R does not really work with "call by reference", but it can mimic that behaviour with a method like this one:
setGeneric("value<-", function(test, value) standardGeneric("value<-"))
setReplaceMethod("value", signature = c("test", "numeric"),
  definition = function(test, value) {
    test@val <- value
    test
  })
foo = new("test") # foo@val is 1 per prototype
value(foo) <- 2 # foo@val is now set to 2
Up to this point, everything I did and got as a result is consistent with my research here on Stack Exchange,
Call by reference in R (using function to modify an object)
and with this code from a lecture (commented and written in German)
What I wish to achieve now is a similar result with the following method:
setGeneric("add<-", function(testA, testB) standardGeneric("add<-"))
setReplaceMethod("add", signature = c("test", "test"),
  definition = function(testA, testB) {
    testA@val <- testA@val + testB@val
    testA
  })
bar = new("test")
add(foo)<-bar #should add the value slot of both objects and save the result to foo
Instead I get the following error:
Error in `add<-`(`*tmp*`, value = <S4 object of class "test">) :
unused argument (value = <S4 object of class "test">)
The function call works with:
"add<-"(foo,bar)
But this does not save the value into foo. Using
foo <- "add<-"(foo,bar)
#or using
setMethod("add",signature = c("test","test"), definition= #as above... )
foo <- add(foo,bar)
works but this is inconsistent with the modifying method value(foo)<-2
I have the feeling that I am missing something simple here.
Any help is very much appreciated!
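For replacement (<-) functions, the last argument must be named 'value'. R rewrites add(foo) <- bar as foo <- `add<-`(foo, value = bar), which is why your version fails with "unused argument (value = ...)".
So in your case: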
setGeneric("add<-", function(testA, value) standardGeneric("add<-"))
setReplaceMethod("add", signature = c("test", "test"),
  definition = function(testA, value) {
    testA@val <- testA@val + value@val
    testA
  })
bar = new("test")
add(foo) <- bar
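The same naming rule applies to ordinary (non-S4) replacement functions, which makes it easy to see in isolation. In this small sketch, `second<-` is a made-up function that replaces the second element of a vector; the two assignments at the end are equivalent.

```r
# The right-hand side of `second(x) <- ...` is passed as `value`
`second<-` <- function(x, value) {
  x[2] <- value
  x
}

v <- c(1, 2, 3)
second(v) <- 10                  # sugar for the call below

w <- c(1, 2, 3)
w <- `second<-`(w, value = 10)   # what R evaluates for the line above

identical(v, w)
```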
You may also use a Reference class if you want to avoid R's usual pass-by-value semantics for arguments.
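A minimal sketch of the Reference class route follows. The class name Test and the add method are chosen for illustration and are not from the question. Objects of a Reference class are modified in place, so no reassignment of foo is needed.

```r
library(methods)

Test <- setRefClass("Test",
  fields = list(val = "numeric"),
  methods = list(
    add = function(other) {
      # `<<-` updates the field of this object in place
      val <<- val + other$val
      invisible(.self)
    }
  )
)

foo <- Test$new(val = 2)
bar <- Test$new(val = 1)
foo$add(bar)   # modifies foo itself
foo$val
```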
