How to use S4 object programming in R

What's wrong with my R script? I'm trying to use a vector of user-defined objects (here a vector of "Page" objects) within another user-defined object (here a "Book" object)
setClass("Page",
  slots = c(PageNo = "numeric",      # scalar
            Contents = "character")  # vector of strings
)
setClass("Book",
  slots = c(Pages = "vector",    # Something wrong here? vector of pages? "Page" or "vector" or "list"
            Title = "character") # vector of strings
)
setGeneric(name="AddPage", def=function(aBook, pageNo){standardGeneric("AddPage")})
setMethod(f="AddPage", signature="Book",
  definition=function(aBook, pageNo)
  {
    page1 = new("Page")
    page1@PageNo = pageNo
    aBook@Pages = c(aBook@Pages, page1) # Something wrong here?
  }
)
book1 = new("Book")
book1@Title = "Sample Book"
book1
book1@Pages
AddPage(book1, 1)
AddPage(book1, 2)
book1@Pages

Remember that R does not use reference semantics, so AddPage(book1, 1) creates a copy of book1, and updates that. In the method you don't return the updated object, and book1 remains unchanged.
Update the method so that it returns the modified object
setMethod(f="AddPage", signature="Book",
  definition=function(aBook, pageNo)
  {
    page1 = new("Page")
    page1@PageNo = pageNo
    aBook@Pages = c(aBook@Pages, page1) # append the new page to the local copy
    aBook                               # return the modified copy
  }
)
and assign the return value to the old variable
book1 = AddPage(book1, 1)
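A minimal runnable sketch of the copy semantics described above. The names PageSketch, BookSketch and addPageSketch are made up for illustration, and declaring the Pages slot as "list" is an assumption here, not something the answer prescribes:

```r
library(methods)

# Hypothetical classes for illustration; Pages is a plain list of page objects.
setClass("PageSketch", slots = c(PageNo = "numeric", Contents = "character"))
setClass("BookSketch", slots = c(Pages = "list", Title = "character"))

setGeneric("addPageSketch",
           function(aBook, pageNo) standardGeneric("addPageSketch"))
setMethod("addPageSketch", "BookSketch", function(aBook, pageNo) {
  page <- new("PageSketch", PageNo = pageNo)
  aBook@Pages <- c(aBook@Pages, list(page))  # modifies the local copy only
  aBook                                      # return the modified copy
})

book1 <- new("BookSketch", Title = "Sample Book")

# Return value discarded: book1 itself is untouched.
invisible(addPageSketch(book1, 1))
n_before <- length(book1@Pages)   # still 0

# Return value assigned back: the change is kept.
book1 <- addPageSketch(book1, 1)
book1 <- addPageSketch(book1, 2)
n_after <- length(book1@Pages)    # now 2
```

Wrapping each page in list() before calling c() keeps the slot a list regardless of how c() would treat a bare S4 object.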
But this is a very inefficient approach: the line aBook@Pages = c(aBook@Pages, page1) copies all existing pages (on the right-hand side, to create a longer vector, so the total work scales with the square of the number of pages added to the book) and then copies the entire Book (for the assignment). In addition, creating individual objects is expensive and does not exploit R's 'vectorization'. A first step is to think of the object 'Page' as 'Pages' instead, where the object models the columns rather than the rows of a data frame. 'Book' then doesn't have a vector of Page objects, but a single Pages object. This also implies a different approach to creating your 'book'.
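One way the column-oriented idea could look is sketched below. The class names PagesColumns and BookColumns and the validity check are assumptions made for illustration; the answer itself does not give an implementation:

```r
library(methods)

# Hypothetical column-oriented design: one object holds all pages.
setClass("PagesColumns",
         slots = c(PageNo = "numeric", Contents = "character"),
         validity = function(object) {
           if (length(object@PageNo) != length(object@Contents))
             "PageNo and Contents must have the same length"
           else
             TRUE
         })
setClass("BookColumns", slots = c(Pages = "PagesColumns", Title = "character"))

# Build every page in one vectorized call instead of one object per page.
pages <- new("PagesColumns",
             PageNo = c(1, 2, 3),
             Contents = c("first page", "second page", "third page"))
book <- new("BookColumns", Title = "Sample Book", Pages = pages)

n_pages  <- length(book@Pages@PageNo)                     # 3
page_two <- book@Pages@Contents[book@Pages@PageNo == 2]   # "second page"
```

With this layout, looking up or filtering pages is ordinary vector subsetting, and the book is created once from complete vectors rather than grown page by page.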

Related

Preallocating a dict of dicts

When I run @code_warntype on the following function (shown in bold are the expressions that are likely raising the red flags)
function cardata(df::DataFrame,Emission_val::Float64,search_cars::Dict{String,Tuple{Int64,Int64}}=Dict("Car1" => (1000,10000), "Car2" => (1000,50000), "Car3" => (1000,6000)),
all_cars::Array{String,1}=["Car1","Car2","Car3","Car4","Car5","Car6"])
**species = Dict()**
# The data file containing car information of interest
car_library = joinpath(path,"cars.csv")
df_car_data=CSV.read(car_library,header=["Model","Velocity","Emission_Value","Mass","Column6"],delim='\t')
#delete unused column
deletecols!(df_car_data, :Column6)
#create a new column with only the car Identifier name
df_car_data[:Identifier_car]=[split(i,r"[0-9]+")[1] for i in df_car_data[:Model]]
#get the properties of all_cars from the cars_data table
for search_models in all_cars
**cars[search_models] = Dict()**
for i in 1:1:length(df_cars_data[1])
num = split(df_cars_data[:Model][i],r"[0-9]+")[1]
alpha = split(df_cars_data[:Model][i],r"[a-zA-Z]+")[2]
if ( num == search_models )
species[num][alpha] = df_car_data[:Velocity][i]
end
end
end
end
I get the following warning highlighted in red:
Body::Tuple{Dict{Any,Any},Union{DataFrame,DataFrameRow{DataFrame,Index}},Any,Any}.
How to preallocate the types for dicts in such a case, assuming that I know the length of data that will populate the dict?
You have not provided a minimal working example.
Have a look at the code below. Note that for efficiency reasons it is recommended to use Symbol as the key rather than String.
species = Dict{Symbol,Dict{Symbol,Float64}}()
group = get!(()->Dict{Symbol,Float64}(),species,Symbol("audi"))
group[Symbol("a4")]=10.5
group[Symbol("a6")]=9.5
And now printing the output:
julia> println(species)
Dict(:audi=>Dict(:a6=>9.5,:a4=>10.5))

filling an array recursively in R language

I have a multidimensional array (B_matrix) that I need to fill up with some random values. Since the dimension depends on two parameters K and C that are user defined, I cannot use nested loop to fill the array, so I have decided to fill it up recursively.
The problem with the recursive function (fillUp) is that even though the array is declared outside the function, the array is set to NULL after the function is run.
B_dim = rep(2,((K+1+C)*2))
B_matrix = array( dim = B_dim, dimnames = NULL)
string = c()
fillUp<-function(level, string ){
  if (level>=1){
    for(i in c(1,2)){
      Recall(level-1, c(string, i))
    }
  }else{
    B_matrix[string] = 1;
  }
}
fillUp(length(B_dim), string)
> sum( B_matrix == 1)
[1] NA
I'm new to R, so I'm not sure if the "global" declaration allows fillUp to change the values of the matrix.
Edit:
Note that the line
B_matrix[string] = 1;
is just a test case, and the original idea is to assign some random value that depends on the position of the array element.
Edit2:
Based on what @Bridgeburners hinted, I'm almost there. Replacing B_matrix[string] = 1, by
assign('str', matrix(string,1), envir=.GlobalEnv)
assign('hl', B_half_length, envir=.GlobalEnv)
rul <-runif(1, 0, sum(str[1:hl]))
with( .GlobalEnv,B_matrix[str] <- rul)
I get the error (last line):
Error in eval(expr, envir, enclos) : object 'rul' not found
The problem, I guess, is that I'm working with variables from two different environments at the same time. I don't know how to proceed here.
This option doesn't work either
assign('str',matrix(string,1), envir=.GlobalEnv)
assign('hl', B_half_length, envir=.GlobalEnv)
assign('ru', runif(1, 0, sum(str[1:hl])), envir=.GlobalEnv)
with( .GlobalEnv,B_matrix[str] <- ru)
Note: no visible binding for global variable 'ru'
Edit3:
I've finally solved it:
assign('str',matrix(string,1), envir=.GlobalEnv)
with( .GlobalEnv, B_matrix[str] <- runif(1, 0, sum(str[1:B_half_length])-B_half_length+1) )
where B_half_length is a global variable
Whenever a process is working within a function, it's working in a different environment. The object "B_matrix" is defined in the global environment. Since you're nesting environments (2*(K+C+1) times) you're not impacting the original object. If you simply replace line
B_matrix[string] = 1;
with
assign('str', matrix(string,1), envir=.GlobalEnv)
with(.GlobalEnv,B_matrix[str] <- 1)
your code will work. You simply need to specify which environment your expression is working in. (In the first line you're passing the local value of "string" to a global object named "str".)
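An alternative that avoids writing to the global environment is to pass the array in and return the modified copy at every level of the recursion. This is a sketch of that variation, not the approach from the answer; fill_up_sketch and the small dimensions are made up for illustration, and the leaf assignment uses a one-row matrix index so that the coordinates address a single cell:

```r
# Hypothetical variant of fillUp that returns the array instead of
# relying on a global object.
fill_up_sketch <- function(arr, level, index = integer(0)) {
  if (level >= 1) {
    for (i in c(1, 2)) {
      # Keep the copy returned by the deeper call.
      arr <- fill_up_sketch(arr, level - 1, c(index, i))
    }
  } else {
    # One-row matrix: each column is the coordinate along one dimension.
    arr[matrix(index, nrow = 1)] <- 1
  }
  arr
}

dims <- rep(2, 4)                      # small stand-in for B_dim
B <- array(NA_real_, dim = dims)
B <- fill_up_sketch(B, length(dims))   # assign the result back
n_filled <- sum(B == 1)                # 16, i.e. every cell was visited
```

The cost is a copy of the array per assignment, so this is only reasonable for small arrays.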
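Note, also, that indexing an array with a plain vector of coordinates doesn't address a single cell: the vector is treated as a set of linear indices.
That is, "B_matrix[2,2,2,2,2,2]" is not the same as "B_matrix[c(2,2,2,2,2,2)]".
But it works with a matrix that has one column per dimension and one row per cell.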
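A small sketch of the difference between vector and matrix indexing, using a made-up 2 x 2 x 2 array rather than the B_matrix from the question:

```r
idx <- c(2, 1, 2)   # intended coordinates: row 2, column 1, slice 2

# Matrix index: one row per cell, one column per dimension.
A <- array(0, dim = c(2, 2, 2))
A[matrix(idx, nrow = 1)] <- 1
cell_value <- A[2, 1, 2]   # 1
cells_set_A <- sum(A)      # 1, exactly one cell was changed

# Plain vector index: values are read as linear positions 2, 1 and 2.
B <- array(0, dim = c(2, 2, 2))
B[idx] <- 1
cells_set_B <- sum(B)      # 2, the first two cells in storage order
```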
What you want can be achieved with the following line of code once you have initialised your B_matrix array:
B_matrix[] <- runif(length(B_matrix))
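If the random value has to depend on the position of each cell, as in the question's last edit, the same one-line idea can be extended by computing the coordinates of all cells up front. The sketch below assumes a small array and uses half_length as a stand-in for the question's B_half_length; the upper bound mirrors the expression from that edit:

```r
set.seed(1)
dims <- rep(2, 4)
half_length <- 2                     # stand-in for B_half_length
B <- array(NA_real_, dim = dims)

# One row of coordinates per cell, in storage order.
coords <- arrayInd(seq_along(B), dim(B))

# Upper bound per cell: sum of the first half_length coordinates,
# shifted as in the question.
upper <- rowSums(coords[, seq_len(half_length), drop = FALSE]) - half_length + 1

# runif is vectorized over max, so every cell gets its own range.
B[] <- runif(length(B), min = 0, max = upper)
```

Because B[] keeps the dim attribute, the result is still a four-dimensional array.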

Partial matching confusion when arguments passed through dots ('...')

I've been working on an R package that is just a REST API wrapper for a graph database. I have a function createNode that returns an object with class node and entity:
# Connect to the db.
graph = startGraph("http://localhost:7474/db/data/")
# Create two nodes in the db.
alice = createNode(graph, name = "Alice")
bob = createNode(graph, name = "Bob")
> class(alice)
[1] "node" "entity"
> class(bob)
[1] "node" "entity"
I have another function, createRel, that creates a relationship between two nodes in the database. It is specified as follows:
createRel = function(fromNode, type, toNode, ...) {
UseMethod("createRel")
}
createRel.default = function(fromNode, ...) {
stop("Invalid object. Must supply node object.")
}
createRel.node = function(fromNode, type, toNode, ...) {
params = list(...)
# Check if toNode is a node.
stopifnot("node" %in% class(toNode))
# Making REST API calls through RCurl and stuff.
}
The ... allows the user to add an arbitrary amount of properties to the relationship in the form key = value. For example,
rel = createRel(alice, "KNOWS", bob, since = 2000, through = "Work")
This creates an (Alice)-[KNOWS]->(Bob) relationship in the db, with the properties since and through and their respective values. However, if a user specifies properties with keys from or to in the ... argument, R gets confused about the classes of fromNode and toNode.
Specifying a property with key from creates confusion about the class of fromNode. It is using createRel.default:
> createRel(alice, "KNOWS", bob, from = "Work")
Error in createRel.default(alice, "KNOWS", bob, from = "Work") :
Invalid object. Must supply node object.
3 stop("Invalid object. Must supply node object.")
2 createRel.default(alice, "KNOWS", bob, from = "Work")
1 createRel(alice, "KNOWS", bob, from = "Work")
Similarly, if a user specifies a property with key to, there is confusion about the class of toNode, and stops at the stopifnot():
Error: "node" %in% class(toNode) is not TRUE
4 stop(sprintf(ngettext(length(r), "%s is not TRUE", "%s are not all TRUE"),
ch), call. = FALSE, domain = NA)
3 stopifnot("node" %in% class(toNode))
2 createRel.node(alice, "KNOWS", bob, to = "Something")
1 createRel(alice, "KNOWS", bob, to = "Something")
I've found that explicitly setting the parameters in createRel works fine:
rel = createRel(fromNode = alice,
type = "KNOWS",
toNode = bob,
from = "Work",
to = "Something")
# OK
But I am wondering how I need to edit my createRel function so that the following syntax will work without error:
rel = createRel(alice, "KNOWS", bob, from = "Work", to = "Something")
# Errors galore.
The GitHub user who opened the issue mentioned it is most likely a conflict with setAs on dispatch, which has arguments called from and to. One solution is to get rid of ... and change createRel to the following:
createRel = function(fromNode, type, toNode, params = list()) {
UseMethod("createRel")
}
createRel.default = function(fromNode, ...) {
stop("Invalid object. Must supply node object.")
}
createRel.node = function(fromNode, type, toNode, params = list()) {
# Check if toNode is a node.
stopifnot("node" %in% class(toNode))
# Making REST API calls through RCurl and stuff.
}
But, I wanted to see if I had any other options before making this change.
Not really an answer, but...
The problem is that the user-provided argument 'from' is being (partially) matched to the formal argument 'fromNode'.
f = function(fromNode, ...) fromNode
f(1, from=2)
## [1] 2
The rules are outlined in section 4.3.2 of RShowDoc('R-lang'), where named arguments are exact matched, then partial matched, and then unnamed arguments are assigned by position.
It's hard to know how to enforce exact matching, other than using single-letter argument names! Actually, for a generic this might not be as trite as it sounds: x is a pretty generic variable name. If 'from' and 'to' were common arguments to ... you could change the argument list to "fromNode, ..., from, to", check for missing(from) in the body of the function, and act accordingly; I don't think this would be pleasant, and the user would invariably provide an argument 'fro'.
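The idea of listing 'from' and 'to' after the dots can be sketched as follows. It relies on the rule that formals appearing after ... are only matched by their full name; create_rel_sketch is a made-up stand-in for createRel and uses strings instead of node objects:

```r
# The problem: 'from' partially matches the formal 'fromNode'.
g <- function(fromNode, ...) fromNode
partial <- g(1, from = 2)   # 2, not 1

# Hypothetical fix: give 'from' and 'to' their own formals after the dots,
# then fold them back into the property list.
create_rel_sketch <- function(fromNode, type, toNode, ..., from, to) {
  props <- list(...)
  if (!missing(from)) props$from <- from
  if (!missing(to))   props$to   <- to
  list(fromNode = fromNode, type = type, toNode = toNode, props = props)
}

res <- create_rel_sketch("alice", "KNOWS", "bob",
                         since = 2000, from = "Work", to = "Something")
```

As the answer warns, this only protects the exact names: an argument such as fro = would still partially match fromNode.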
While enforcing exact matching (and errors, via warn=2) by setting global options() might be helpful in debugging (though by then you'd probably know what you were looking for!) it doesn't help the package author who is trying to write code to work for users in general.
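For debugging, I take the option meant here to be warnPartialMatchArgs, which makes R warn whenever an argument is partially matched. A minimal sketch that captures the warning instead of printing it:

```r
f <- function(fromNode, ...) fromNode

# Turn the warning on, remembering the previous setting.
old <- options(warnPartialMatchArgs = TRUE)

# Capture the warning raised by the partial match of 'from' to 'fromNode'.
msg <- tryCatch(f(1, from = 2),
                warning = function(w) conditionMessage(w))

# Restore the previous setting.
options(old)
```

Combined with options(warn = 2), the same call would stop with an error instead of a warning.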
It might be reasonable to ask on the R-devel mailing list whether it might be time for this behavior to be changed (on the 'several releases' time scale); partial matching probably dates as a 'convenience' from the days before tab completion.

Order of methods in R reference class and multiple files

There is one thing I really don't like about R reference class: the order you write the methods matters. Suppose your class goes like this:
myclass = setRefClass("myclass",
fields = list(
x = "numeric",
y = "numeric"
))
myclass$methods(
afunc = function(i) {
message("In afunc, I just call bfunc...")
bfunc(i)
}
)
myclass$methods(
bfunc = function(i) {
message("In bfunc, I just call cfunc...")
cfunc(i)
}
)
myclass$methods(
cfunc = function(i) {
message("In cfunc, I print out the sum of i, x and y...")
message(paste("i + x + y = ", i+x+y))
}
)
myclass$methods(
initialize = function(x, y) {
x <<- x
y <<- y
}
)
And then you start an instance, and call a method:
x = myclass(5, 6)
x$afunc(1)
You will get an error:
Error in x$afunc(1) : could not find function "bfunc"
I am interested in two things:
Is there a way to work around this nuisance?
Does this mean I can never split a really long class file into multiple files? (e.g. one file for each method.)
Calling bfunc(i) isn't going to invoke the method since it doesn't know what object it is operating on!
In your method definitions, .self is the object the method was invoked on. So change your code to:
myclass$methods(
afunc = function(i) {
message("In afunc, I just call bfunc...")
.self$bfunc(i)
}
)
(and similarly for bfunc). Are you coming from C++ or some language where functions within methods are automatically invoked within the object's context?
Some languages make this more explicit, for example in Python a method with one argument like yours actually has two arguments when defined, and would be:
def afunc(self, i):
[code]
but called like:
x.afunc(1)
then within afunc there is the self variable, which refers to x (calling it self is a universal convention, but it could be called anything).
In R, the .self is a little bit of magic sprinkled over reference classes. I don't think you could change it to .this even if you wanted.
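A compact, runnable sketch of the .self$ advice, with each method added in a separate methods() call and in the same order as in the question. The class name MyClassSketch is made up, and the default initializer is used instead of the question's custom initialize:

```r
library(methods)

MyClassSketch <- setRefClass("MyClassSketch",
                             fields = list(x = "numeric", y = "numeric"))

# afunc is defined before bfunc exists; .self$ defers the lookup to call time.
MyClassSketch$methods(afunc = function(i) .self$bfunc(i))
MyClassSketch$methods(bfunc = function(i) .self$cfunc(i))
MyClassSketch$methods(cfunc = function(i) i + x + y)

obj <- MyClassSketch$new(x = 5, y = 6)
result <- obj$afunc(1)   # 12
```

Since nothing here depends on definition order, the separate methods() calls could live in separate files, as long as they run after setRefClass() and before objects are created.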

Modify contents of object with "call by reference"

I am trying to modify the contents of an object defined by a self-written class with a function that takes two objects of this class and adds the contents.
setClass("test",representation(val="numeric"),prototype(val=1))
I know that R does not really work with "call by reference", but it can mimic that behaviour with a method like this one:
setGeneric("value<-", function(test,value) standardGeneric("value<-"))
setReplaceMethod("value",signature = c("test","numeric"),
  definition=function(test,value) {
    test@val <- value
    test
  })
foo = new("test") #foo@val is 1 per prototype
value(foo)<-2 #foo@val is now set to 2
Up to here, everything I did and got as a result is consistent with my research here on Stack Exchange,
Call by reference in R (using function to modify an object)
and with this code from a lecture (commented and written in German)
What I wish to achieve now is a similar result with the following method:
setGeneric("add<-", function(testA,testB) standardGeneric("add<-"))
setReplaceMethod("add",signature = c("test","test"),
  definition=function(testA,testB) {
    testA@val <- testA@val + testB@val
    testA
  })
bar = new("test")
add(foo)<-bar #should add the value slot of both objects and save the result to foo
Instead I get the following error:
Error in `add<-`(`*tmp*`, value = <S4 object of class "test">) :
unused argument (value = <S4 object of class "test">)
The function call works with:
"add<-"(foo,bar)
But this does not save the value into foo. Using
foo <- "add<-"(foo,bar)
#or using
setMethod("add",signature = c("test","test"), definition= #as above... )
foo <- add(foo,bar)
works but this is inconsistent with the modifying method value(foo)<-2
I have the feeling that I am missing something simple here.
Any help is very much appreciated!
I do not remember why, but for <- functions, the last argument must be named 'value'.
So in your case:
setGeneric("add<-", function(testA,value) standardGeneric("add<-"))
setReplaceMethod("add",signature = c("test","test"),
  definition=function(testA,value) {
    testA@val <- testA@val + value@val
    testA
  })
bar = new("test")
add(foo)<-bar
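The reason for the naming rule is visible in the error message from the question: R rewrites add(foo) <- bar into a call of `add<-` that passes the right-hand side as an argument named value, and assigns the result back to foo. A small sketch of that equivalence, using the made-up class name TestSketch so it can run on its own:

```r
library(methods)

setClass("TestSketch", representation(val = "numeric"), prototype(val = 1))

# The last formal of a replacement function must be called 'value'.
setGeneric("add<-", function(x, value) standardGeneric("add<-"))
setReplaceMethod("add", signature("TestSketch", "TestSketch"),
                 function(x, value) {
                   x@val <- x@val + value@val
                   x
                 })

foo <- new("TestSketch")            # val is 1 per prototype
bar <- new("TestSketch", val = 2)

add(foo) <- bar                     # replacement syntax
after_sugar <- foo@val              # 3

foo <- `add<-`(foo, value = bar)    # what the line above expands to
after_explicit <- foo@val           # 5
```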
You may also use a Reference class if you want to avoid the traditional behaviour of passing arguments by value.
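A minimal sketch of the Reference class route, where the method changes the object in place and no reassignment is needed. TestRefSketch and its add method are made-up names for illustration:

```r
library(methods)

TestRefSketch <- setRefClass("TestRefSketch",
  fields = list(val = "numeric"),
  methods = list(
    add = function(other) {
      # <<- updates the field of this object, not a local copy.
      val <<- val + other$val
      invisible(.self)
    }
  )
)

foo <- TestRefSketch$new(val = 1)
bar <- TestRefSketch$new(val = 2)

foo$add(bar)          # modifies foo in place
foo_val <- foo$val    # 3
bar_val <- bar$val    # 2, unchanged
```

Keep in mind that with reference semantics, assigning foo to another variable does not copy it; both names then refer to the same object.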
