What are these square brackets for S3 classes? - r

I obtained this from an open source repo on git. This shows the writing of generic and methods for S3 classes. But I do not understand the notations or conventions that the functions are being assigned to. The following are my questions:
The use of backticks `` to define the function name. Usually we wouldn't use backticks or even double quotes when assigning variables/functions, but I see this happening a lot. Is this a naming convention?
Why is the . included before the blob name? Usually wouldn't it just be called blob and a method would be method.blob?
Why are there [ brackets there, especially [<- and [[<-? Are we performing some sort of double assignment?
Hopefully someone will be able to shed some light on what is happening.
#' @export
`[.blob` <- function(x, i, ...) {
  new_blob(NextMethod())
}
#' @export
`[<-.blob` <- function(x, i, ..., value) {
  if (!is_raw_list(value)) {
    stop("RHS must be list of raw vectors", call. = FALSE)
  }
  NextMethod()
}
#' @export
`[[<-.blob` <- function(x, i, ..., value) {
  if (!is.raw(value) && !is.null(value)) {
    stop("RHS must be raw vector or NULL", call. = FALSE)
  }
  if (is.null(value)) {
    x[i] <- list(NULL)
    x
  } else {
    NextMethod()
  }
}

Summary
If you're creating a new object in R for which you want 'different' subset and assignment behaviour, you should create the associated methods for these operations.
The . IS working in the way you're expecting - method dispatch
[.blob is overriding the S3 [ subset operator
[<-.blob is overriding the S3 [<- operator (i.e. vector-subset assignment)
[[<-.blob is overriding the S3 [[<- operator (i.e. list-subset assignment)
Names containing special symbols (e.g., brackets, percent signs, or spaces) cannot be assigned to directly; surrounding the name in backticks makes the assignment work. For example, a variable named A B cannot be assigned with A B <- 1, whereas `A B` <- 1 works (credit @r2evans)
Examples
subset
Taking [.blob as an example, this allows you to create your own subset operation for your blob object.
## Create your own blob object (class)
blob <- 1:5
attr(blob, "class") <- "blob"
## create a subset operator, which in this example just calls the next method in the s3 dispatch chain
`[.blob` <- function(x, i, j, ...) NextMethod()
As we're not doing anything special in our own subset method, this works like normal R vectors
blob[3]
# [1] 3
However, we can make the subset operation do whatever we want, for example always return the 1st element of the vector
## define the function to always subset the first element
`[.blob` <- function(x, i, j, ...) { i = 1; NextMethod() }
Now your blob object will only ever return the 1st element.
blob[1]
# [1] 1
blob[2]
# [1] 1
blob[3]
# [1] 1
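A more typical method, like the `[.blob` in the question, leaves the subsetting itself to NextMethod() and only restores the class on the result. A minimal sketch, assuming a hypothetical class "thing" and a hypothetical constructor new_thing() (neither comes from the blob package):

```r
# Hypothetical constructor: tags a vector with the class "thing"
new_thing <- function(x) structure(x, class = "thing")

# Subset method: let the default `[` do the work, then re-apply the class
`[.thing` <- function(x, i, ...) new_thing(NextMethod())

obj <- new_thing(1:5)
sub_obj <- obj[2:3]

inherits(sub_obj, "thing")   # the class survives subsetting
unclass(sub_obj)             # the underlying values are elements 2 and 3
```

Without such a method, subsetting would typically hand back a plain vector, which is why packages define `[` methods for their classes.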
Assignment
Similarly for one of the assignment operators, if you overload [<- with
`[<-.blob` <- function(x, i, j, ...) { i = 5; NextMethod() }
This will always assign the 5th element of your blob object with the new value
blob[1] <- 100
blob
# [1] 1 2 3 4 100
# attr(,"class")
# [1] "blob"
Back ticks
The back-ticks are used so we can assign functions/variables to special symbols.
For example, try to assign a vector to the [ symbol
[ <- 1:5
# Error: unexpected '[' in "["
Whereas surrounding it with ticks lets it pass (although this example is not recommended)
`[` <- 1:5
`[`
# [1] 1 2 3 4 5
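It may help to see that the bracket operators are ordinary functions whose names happen to need backticks. A small sketch on a plain numeric vector (the variable names are only illustrative):

```r
x <- c(10, 20, 30)

# x[2] is syntactic sugar for calling the function named `[`
same_subset <- identical(x[2], `[`(x, 2))

# x[2] <- 99 is evaluated roughly as x <- `[<-`(x, 2, value = 99)
y <- `[<-`(x, 2, value = 99)

same_subset
y
```

Because `[` and `[<-` are generics, appending .blob to those names is the usual generic.class pattern for S3 methods.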

Related

How is it decided if print uses a prefix

I'm wondering about a very basic behaviour of the print() function for which I was unable to find an explanation.
If I add names() to my vector, I don't get a prefix ([1]). Minimal example:
x <- 1
names(x) <- "name"
print(x)
y<-2
print(y)
Output:
print(x)
name
1
and
print(y)
[1] 2
I was wondering if names() changes the class or something like that by adding the attribute. But typeof() and class() return the same value for x and y. So I think it is the print() function that omits the prefix when a names attribute is present. When does print() use the prefix [x] and when does it not?
In the first case object x is a numeric vector and has no attributes. In the second - object y is also a numeric vector, but has an attribute "names":
x <- 1
attributes(x)
# NULL
y <- 1
names(y) <- "value"
attributes(y)
# $`names`
# [1] "value"
In both cases the print.default() method is used to display the value of the object. This function calls .Internal(print.default(x, digits, quote, na.print, print.gap, right, max, useSource, noOpt)).
Looking at the source code of this function (for example here: https://github.com/wch/r-source/blob/trunk/src/main/print.c) we can see that depending on whether the vector has attributes or not the output contains indexes of the first elements on each output line or not:
if((dims = getAttrib(s, R_DimSymbol)) != R_NilValue && length(dims) > 1) {
...
}
else { // no dim()
...
sprintf(ptag, "[[%d]]", i+1);
...
}
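The effect can also be checked from R itself, without reading the C source. A small sketch using capture.output() to compare what print() writes for a named and an unnamed vector:

```r
x <- c(name = 1)

# With names: a header line of names, then the values, and no [1] prefix
named_out <- capture.output(print(x))

# Without names: the usual index prefix comes back
plain_out <- capture.output(print(unname(x)))

named_out
plain_out
```

So removing the names attribute with unname() is enough to get the index prefix back.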

passing vector to function in R

I'm trying to create a function that subtracts 2 from each element of a vector, and whenever I pass a vector as a parameter to the function, it's outputting an error:
Error in sub(x) : argument "x" is missing, with no default.
so I have a vector that's called x1,
and my function call looks like that: sub(x1)
any help will be appreciated.
sub <- function(x)
{
for(i in 1:length(x))
{
x[i] = x[i]-2
}
return(x)
}
In R a lot of functions and operators (which are just a special form of function) are vectorised. Vectorisation means that a function/operator works automatically on all elements of a vector (or vector-like object).
Therefore, our problem can be solved with much less code. In addition using vectorised functions (especially basic stuff like +, -, ...) is much much much faster than looping over elements.
# define function that does subtraction
sub <- function(x){
x - 2
}
# define vector with numbers ranging from 1 to 20
my_vector <- 1:20
# call function with my_vector as argument
sub(my_vector)
In regard to your error:
Error in sub(x) : argument "x" is missing, with no default.
It is telling you that you called a function sub() without providing a value for its argument x. Since you did not provide one, there is no default, and R cannot find a value any other way, R does not know what to do and signals (throws) an error.
I can reproduce your error like so:
# call sub without argument
sub()
## Error in sub() : argument "x" is missing, with no default
I can prevent it by providing a value for argument x, like so:
# call sub with value for x
sub(1)
sub(x = 1)
... Or I can provide defaults like this:
# define function with default values
sub <- function(x = NULL){
x - 2
}
# call new 'robust' sub() function without arguments
sub()
## numeric(0)
... Or I can check for a missing argument with missing():
# define function that handles a missing argument
sub <- function(x){
if ( missing(x) ){
x <- NULL
}
x - 2
}
# call new 'robust' sub() function without arguments
sub()
## numeric(0)
Resources:
https://www.youtube.com/watch?v=M4fMccWy5lU
https://www.stat.berkeley.edu/~statcur/Workshop2/Presentations/functions.pdf
http://adv-r.had.co.nz/Functions.html
?`function`
https://cran.r-project.org/doc/manuals/r-patched/R-intro.html#Writing-your-own-functions
I suppose you forgot to run your function definition:
sub2 <- function(x)
{
for(i in 1:length(x))
{
x[i] = x[i]-2
}
return(x)
}
sub2(1:4) ## works fine
sub(1:4) ## Error calling the function sub(pattern, replacement, x, ...)
Error in sub(1:4) : argument "x" is missing, with no default
or
> x1 <- 1:4
> sub(x1) ## Error
Error in sub(x1) : argument "x" is missing, with no default
If you had chosen another name for your function (one that is not the name of an existing R function), the message would have been clear (run this in a new R session):
# sub2 <- function(x)
# {
# for(i in 1:length(x))
# {
# x[i] = x[i]-2
# }
# return(x)
# }
sub2(1:4)
# > sub2(1:4)
# Error in sub2(1:4) : could not find function "sub2"
I commented out the function definition to simulate not running of the function definition
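The wording of the original error also follows from the signature of base R's sub(): its first two arguments are pattern and replacement, so a single vector passed positionally is matched to pattern and x is left missing. A short sketch; subtract_two is a hypothetical name chosen only to avoid the clash:

```r
# Formal arguments of base R's sub(); x is only the third one
sub_args <- names(formals(base::sub))
sub_args

# The user's function under a name that does not mask a base function
subtract_two <- function(x) x - 2
result <- subtract_two(1:4)
result
```

Checking a candidate name with ?name or exists("name") before defining a function is a cheap way to avoid this kind of masking.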

Make return from S3 indexing function "[" invisible

Is it possible to return an invisible object when using the S3 indexing function "[" on a custom class? For example, in the code below, is there a way to make the last line of code not print anything?
mat <- function(x) {
structure(x, class="mat")
}
"[.mat" <- function(x, i, j) {
invisible(unclass(x)[i,j])
}
m1 <- mat(matrix(1:10, ncol=2))
m1[1:2,]
     [,1] [,2]
[1,]    1    6
[2,]    2    7
You are running into issues with the visibility mechanism caused by primitive functions. Consider:
> length.x <- function(x) invisible(23)
> length(structure(1:10, class="x"))
[1] 23
> mean.x <- function(x) invisible(23)
> mean(structure(1:10, class="x"))
> # no output
length is a primitive, but mean is not. From R Internals:
Whether the returned value of a top-level R expression is printed is controlled by the global boolean variable R_Visible. This is set (to true or false) on entry to all primitive and internal functions based on the eval column of the table in file src/main/names.c: the appropriate setting can be extracted by the macro PRIMPRINT.
and
Internal and primitive functions force the documented setting of R_Visible on return, unless the C code is allowed to change it (the exceptions above are indicated by PRIMPRINT having value 2).
So it would seem that you cannot force invisible returns from primitive generics like [, length, etc., and you must resort to workarounds like the one suggested by Alex.
The problem is that the value returned from [.mat is not of class mat since you're using unclass, so it uses the default printing method for whatever class it has. To fix this, just ensure that the returned object is still a mat and define a printing method for mat objects.
mat <- function(x) {
class(x) <- "mat"
x
}
`[.mat` <- function(x, i, j) {
y <- mat(unclass(x)[i, j])
invisible(y)
}
print.mat <- function(x, ...) {
invisible(x)
}
test <- mat(matrix(1:10, ncol = 2))
test[1, 1]
# Nothing is printed

Is it possible to modify an object on a list in a parent frame in R?

I'm working on an R package that has a number of functions that follow a non-R-standard practice of modifying in place the object passed in as an argument. This normally works OK, but fails when the object to be modified is on a list.
A function giving an example of the form of the assignments:
myFun<-function(x){
  xn <- deparse(substitute(x))
  ev <- parent.frame()
  # would do real stuff here ..
  # instead set simple value to modify local copy
  x[[1]]<-"b"
  # assign in parent frame
  if (exists(xn, envir = ev))
    on.exit(assign(xn, x, pos = ev))
  # return invisibly
  invisible(x)
}
This works:
> myObj <-list("a")
> myFun(myObj)
> myObj
[[1]]
[1] "b"
But it does not work if the object is a member of a list:
> myObj <-list("a")
> myList<-list(myObj,myObj)
> myFun(myList[[1]])
> myList
[[1]]
[[1]][[1]]
[1] "a"
[[2]]
[[2]][[1]]
[1] "a"
After reading answers to other questions here, I see the docs for assign clearly state:
assign does not dispatch assignment methods, so it cannot be used to set elements of vectors, names, attributes, etc.
Since there is an existing codebase using these functions, we cannot abandon the modify-in-place syntax. Does anyone have suggestions for workarounds or alternative approaches for modifying objects which are members of a list in a parent frame?
UPDATE:
I've considered trying to roll my own assignment function, something like:
assignToListInEnv<-function(name,env,value){
# assume name is something like "myList[[1]]"
#check for brackets
index<-regexpr('[[',name,fixed=TRUE)[1]
if(index>0){
lname<-substr(name,0,index-1)
#check that it exists
if (exists(lname,where=env)){
target<-get(lname,pos=env)
# make sure it is a list
if (is.list(target)){
eval(parse(text=paste('target',substr(name,index,999),'<-value',sep='')))
assign(lname, target, pos = env)
} else {
stop('object ',lname,' is not a list in environment ',env)
}
} else {
stop('unable to locate object ',lname,' in frame ',env)
}
}
}
But it seems horribly brittle, would need to handle many more cases ($ and [ as well as [[) and would probably still fail for [[x]] because x would be evaluated in the wrong frame...
Since it was in the first search results to my query, here's my solution :
You can use paste() with "<<-" to create an expression which will assign the value to your list element when evaluated.
assignToListInEnv<-function(name, value, env = parent.frame()){
cl <- as.list(match.call())
lang <- str2lang(paste(cl["name"], "<<-", cl["value"]))
eval(lang, envir = env)
}
EDIT : revisiting this answer because it got a vote up
I'm not sure why I used <<- instead of <-. If the 'env' argument is used, <<- will assign in the parent of that env rather than in env itself.
So if you always want it to be the first parent.frame it can just be :
assignToListInParentFrame<-function(name, value){
cl <- as.list(match.call())
paste(cl["name"], "<<-", cl["value"]) |>
str2lang() |>
eval()
}
and if you want to specify the env in which to modify the list:
assignToListInEnv<-function(name, value, env){
cl <- as.list(match.call())
paste(cl["name"], "<-", cl["value"]) |>
str2lang() |>
eval(envir = env)
}
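An alternative that avoids pasting strings is to capture the argument expression with substitute() and build the assignment call with bquote(). This is a sketch under assumptions, not a drop-in replacement: set_first() is a hypothetical stand-in for myFun(), and an index expression such as myList[[i]] is evaluated in the caller's frame, so i must be visible there.

```r
set_first <- function(x) {
  target <- substitute(x)   # unevaluated argument, e.g. myList[[1]]
  env <- parent.frame()     # frame the function was called from
  x[[1]] <- "b"             # modify the local copy (this forces the argument)
  # Build the call `myList[[1]] <- <modified value>` and run it in the caller
  eval(bquote(.(target) <- .(x)), env)
  invisible(x)
}

myObj <- list("a")
myList <- list(myObj, myObj)

set_first(myObj)        # plain variable
set_first(myList[[1]])  # element of a list

myObj
myList
```

Because the assignment is evaluated as a real `<-` call, R's replacement functions ([[<-, $<-, [<-) are dispatched as usual, which is what assign() cannot do.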

Finding the names of all functions in an R expression

I'm trying to find the names of all the functions used in an arbitrary legal R expression, but I can't find a function that will flag the below example as a function instead of a name.
test <- expression(
this_is_a_function <- function(var1, var2){
this_is_a_function(var1-1, var2)
})
all.vars(test, functions = FALSE)
[1] "this_is_a_function" "var1" "var2"
all.vars(expr, functions = FALSE) seems to return functions declarations (f <- function(){}) in the expression, while filtering out function calls ('+'(1,2), ...).
Is there any function - in the core libraries or elsewhere - that will flag 'this_is_a_function' as a function, not a name? It needs to work on arbitrary expressions, that are syntactically legal but might not evaluate correctly (e.g '+'(1, 'duck'))
I've found similar questions, but they don't seem to contain the solution.
If clarification is needed, leave a comment below. I'm using the parser package to parse the expressions.
Edit: @Hadley
I have expressions which contain entire scripts, which usually consist of a main function containing nested function definitions, with a call to the main function at the end of the script.
Functions are all defined inside the expressions, and I don't mind if I have to include '<-' and '{', since I can easily filter them out myself.
The motivation is to take all my R scripts and gather basic statistics about how my use of functions has changed over time.
Edit: Current Solution
A Regex-based approach grabs the function definitions, combined with the method in James' comment to grab function calls. Usually works, since I never use right-hand assignment.
function_usage <- function(code_string){
  # takes a script, extracts function definitions
  require(stringr)
  code_string <- str_replace(code_string, 'expression\\(', '')
  # pattern for definitions written as name <- function
  arrow_assign <- '.+[ \n]+<-[ \n]+function'
  # pattern for definitions written as name = function
  equal_assign <- '.+[ \n]+=[ \n]+function'
  function_names <- sapply(
    strsplit(
      str_match(code_string, arrow_assign), split = '[ \n]+<-'),
    function(x) x[1])
  function_names <- c(function_names, sapply(
    strsplit(
      str_match(code_string, equal_assign), split = '[ \n]+='),
    function(x) x[1]))
  return(table(function_names))
}
Short answer: is.function checks whether a variable actually holds a function. This does not work on (unevaluated) calls because they are calls. You also need to take care of masking:
mean <- mean (x)
Longer answer:
IMHO there is a big difference between the two occurrences of this_is_a_function.
In the first case you'll assign a function to the variable with name this_is_a_function once you evaluate the expression. The difference is the same difference as between 2+2 and 4.
However, just finding <- function () does not guarantee that the result is a function:
f <- function (x) {x + 1} (2)
The second occurrence is syntactically a function call. You can determine from the expression that a variable called this_is_a_function which holds a function needs to exist in order for the call to evaluate properly. BUT: you don't know whether it exists from that statement alone. However, you can check whether such a variable exists, and whether it is a function.
The fact that functions are stored in variables like other types of data, too, means that in the first case you can know that the result of function () will be a function and from that conclude that immediately after this expression is evaluated, the variable with name this_is_a_function will hold a function.
However, R is full of names and functions: "<-" is the name of the assignment function (a variable holding the assignment function) ...
After evaluating the expression, you can verify this by is.function (this_is_a_function).
However, this is by no means the only expression that returns a function: Think of
f <- function () {g <- function (){}}
> body (f)[[2]][[3]]
function() {
}
> class (body (f)[[2]][[3]])
[1] "call"
> class (eval (body (f)[[2]][[3]]))
[1] "function"
all.vars(expr, functions = FALSE) seems to return functions declarations (f <- function(){}) in the expression, while filtering out function calls ('+'(1,2), ...).
I'd say it is the other way round: in that expression f is the variable (name) which will be assigned the function (once the call is evaluated). + (1, 2) evaluates to a numeric, unless you keep it from doing so.
> e <- expression (1 + 2)
> e [[1]]
1 + 2
> e [[1]][[1]]
`+`
> class (e [[1]][[1]])
[1] "name"
> eval (e [[1]][[1]])
function (e1, e2) .Primitive("+")
> class (eval (e [[1]][[1]]))
[1] "function"
Instead of looking for function definitions, which is going to be effectively impossible to do correctly without actually evaluating the functions, it will be easier to look for function calls.
The following function recursively spiders the expression/call tree returning the names of all objects that are called like a function:
find_calls <- function(x) {
  # Base case
  if (!is.recursive(x)) return()
  recurse <- function(x) {
    sort(unique(as.character(unlist(lapply(x, find_calls)))))
  }
  if (is.call(x)) {
    f_name <- as.character(x[[1]])
    c(f_name, recurse(x[-1]))
  } else {
    recurse(x)
  }
}
It works as expected for a simple test case:
x <- expression({
f(3, g())
h <- function(x, y) {
i()
j()
k(l())
}
})
find_calls(x)
# [1] "{" "<-" "f" "function" "g" "i" "j"
# [8] "k" "l"
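If the code is available as text rather than as an already-parsed expression, base R's parser can also label call sites directly: utils::getParseData() marks each called name with the token SYMBOL_FUNCTION_CALL. A sketch with an illustrative code string; operators such as + and <- get their own token types and are therefore not included:

```r
code <- "f <- function(x) { g(x) + mean(x) }"

# keep.source = TRUE is needed so that parse data is attached
parsed <- parse(text = code, keep.source = TRUE)
tokens <- utils::getParseData(parsed)

# Names that appear in call position, e.g. g( and mean(
called <- sort(unique(tokens$text[tokens$token == "SYMBOL_FUNCTION_CALL"]))
called
```

Like find_calls(), this never evaluates the code, so it also works on scripts that would fail at run time.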
Just to follow up here as I have also been dealing with this problem: I have now created a C-level function to do this using code very similar to the C implementation of all.names and all.vars in base R. It however only works with objects of type "language" i.e. function calls, not type "expression". Demonstration:
ex = quote(sum(x) + mean(y) / z)
all.names(ex)
#> [1] "+" "sum" "x" "/" "mean" "y" "z"
all.vars(ex)
#> [1] "x" "y" "z"
collapse::all_funs(ex)
#> [1] "+" "sum" "/" "mean"
Created on 2022-08-17 by the reprex package (v2.0.1)
This generalizes to arbitrarily complex nested calls.
