How is it decided if print uses a prefix

I'm wondering about a very basic behaviour of the print() function for which I was unable to find an explanation.
If I add names() to my vector, the output has no prefix ([1]). Minimal example:
x <- 1
names(x) <- "name"
print(x)
y<-2
print(y)
Output:
print(x)
name
1
and
print(y)
[1] 2
I was wondering whether names() changes the class or something like that by adding the attribute, but typeof() and class() return the same values for x and y. So I think it is the print() function that omits the prefix when an attribute is present. When does print() use the prefix [x] and when does it not?

In the example below, object x is a numeric vector with no attributes, while object y is also a numeric vector but has a "names" attribute (the roles of x and y are swapped relative to the question):
x <- 1
attributes(x)
# NULL
y <- 1
names(y) <- "value"
attributes(y)
# $`names`
# [1] "value"
In both cases the print.default() method is used to display the value of the object. This function calls .Internal(print.default(x, digits, quote, na.print, print.gap, right, max, useSource, noOpt)).
Looking at the source code of this function (for example here: https://github.com/wch/r-source/blob/trunk/src/main/print.c), we can see that, depending on whether the vector has attributes or not, the output does or does not contain the index of the first element on each output line:
if((dims = getAttrib(s, R_DimSymbol)) != R_NilValue && length(dims) > 1) {
...
}
else { // no dim()
...
sprintf(ptag, "[[%d]]", i+1);
...
}
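The effect can also be checked directly from R without reading the C source. The following is a minimal sketch (the vector v and its element names are made up for illustration) showing that the index prefix disappears when a names attribute is present and comes back once the names are removed with unname():

```r
# Hypothetical example vector; the element names are arbitrary.
v <- c(first = 1, second = 2)

# With a names attribute, the names label the values and no [1] prefix is shown.
named_out <- capture.output(print(v))

# Without names, each output line starts with the index of its first element.
plain_out <- capture.output(print(unname(v)))

named_out
plain_out

# typeof() and class() are unaffected by the names attribute.
identical(typeof(v), typeof(unname(v)))
identical(class(v), class(unname(v)))
```

capture.output() is used here only so that the printed lines can be inspected as character vectors.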

Related

Getting the name of the assigned variable to an argument in a function

I'd like to get the name of the variable that has been passed as an argument to a function. The real function is more complex, but an example scenario is below, where I want the function to return the name of the variable, x:
e.g.
x <- 3
my_function <- function(arg1) {print(arg1)}
where: my_function(x) will return x
and: my_function(y) will return y
Is this possible?

Use substitute() to return a symbol object. If we wrap the substitute() call in deparse(), it returns a character string instead.
my_function <- function(arg1) substitute(arg1)
Testing:
> my_function(x)
x
> my_function(y)
y
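The deparse() variant mentioned above could look like the following minimal sketch. It reuses the names my_function, arg1 and x from the question; nothing else is assumed.

```r
# Sketch: return the argument's name as a character string.
my_function <- function(arg1) deparse(substitute(arg1))

x <- 3
my_function(x)
my_function(y)  # y need not exist, because the argument is never evaluated
```

Because substitute() looks at the unevaluated argument expression, this also works when the variable passed in has not been defined.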

passing vector to function in R

I'm trying to create a function that subtracts 2 from each element of a vector, and whenever I pass a vector as a parameter to the function, it's outputting an error:
Error in sub(x) : argument "x" is missing, with no default.
I have a vector called x1, and my function call looks like this: sub(x1)
Any help will be appreciated.
sub <- function(x)
{
for(i in 1:length(x))
{
x[i] = x[i]-2
}
return(x)
}

In R, a lot of functions and operators (which are just a special form of function) are vectorised. Vectorisation means that a function/operator automatically works on all elements of a vector (or vector-like object).
Therefore, our problem can be solved with much less code. In addition, using vectorised functions (especially basic ones like +, -, ...) is much faster than looping over elements.
# define function that does subtraction
sub <- function(x){
x - 2
}
# define vector with numbers ranging from 1 to 20
my_vector <- 1:20
# call function with my_vector as argument
sub(my_vector)
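To see that the loop and the vectorised version compute the same thing, here is a small sketch. The function names subtract_two_loop and subtract_two_vec are hypothetical and were chosen so that base::sub() is not masked.

```r
# Loop version, as in the question (hypothetical name).
subtract_two_loop <- function(x) {
  for (i in seq_along(x)) {   # seq_along() also behaves well for zero-length input
    x[i] <- x[i] - 2
  }
  x
}

# Vectorised version (hypothetical name).
subtract_two_vec <- function(x) x - 2

my_vector <- 1:20
identical(subtract_two_loop(my_vector), subtract_two_vec(my_vector))
```

seq_along(x) is used instead of 1:length(x) because 1:length(x) evaluates to c(1, 0) when x is empty, which would make the loop run over indices that do not exist.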
In regard to your error:
Error in sub(x) : argument "x" is missing, with no default.
It is telling you that you called a function sub() without providing an appropriate value for its argument x. Since you did not provide it, there is no default, and R cannot find a value any other way, R does not know what to do and signals (throws) an error.
I can reproduce your error like so:
# call sub without argument
sub()
## Error in sub() : argument "x" is missing, with no default
I can prevent it by providing a value for argument x, like so:
# call sub with value for x
sub(1)
sub(x = 1)
... Or I can provide defaults like this:
# define function with default values
sub <- function(x = NULL){
x - 2
}
# call new 'robust' sub() function without arguments
sub()
## numeric(0)
... Or I can handle the missing argument explicitly with missing():
# define function that falls back to NULL when x is missing
sub <- function(x){
if ( missing(x) ){
x <- NULL
}
x - 2
}
# call new 'robust' sub() function without arguments
sub()
## numeric(0)
Resources:
https://www.youtube.com/watch?v=M4fMccWy5lU
https://www.stat.berkeley.edu/~statcur/Workshop2/Presentations/functions.pdf
http://adv-r.had.co.nz/Functions.html
?`function`
https://cran.r-project.org/doc/manuals/r-patched/R-intro.html#Writing-your-own-functions

I suppose you forgot to run your function definition:
sub2 <- function(x)
{
for(i in 1:length(x))
{
x[i] = x[i]-2
}
return(x)
}
sub2(1:4) ## works fine
sub(1:4) ## Error calling the function sub(pattern, replacement, x, ...)
Error in sub(1:4) : argument "x" is missing, with no default
or
> x1 <- 1:4
> sub(x1) ## Error
Error in sub(x1) : argument "x" is missing, with no default
If you had chosen another name for your function (not the name of an existing R function), the message would have been clear (to be run in a new R session):
# sub2 <- function(x)
# {
# for(i in 1:length(x))
# {
# x[i] = x[i]-2
# }
# return(x)
# }
sub2(1:4)
# > sub2(1:4)
# Error in sub2(1:4) : could not find function "sub2"
I commented out the function definition to simulate not having run it.
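One way to notice such a name clash up front is to check whether the name is already taken in base R. The sketch below assumes a hypothetical helper is_base_function(); the name "subtract_two" is also made up and stands for any unused name.

```r
# base::sub() already exists and has its own argument called x.
exists("sub")
names(formals(base::sub))

# Hypothetical helper: TRUE if the name is bound to a function in base R.
is_base_function <- function(name) {
  exists(name, envir = baseenv(), mode = "function", inherits = FALSE)
}

is_base_function("sub")
is_base_function("subtract_two")
```

The second line of output lists pattern, replacement and x among the arguments, which is why calling sub(x1) with a single argument reports that x is missing.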

What are these square brackets for S3 classes?

I obtained this from an open source repo on git. This shows the writing of generic and methods for S3 classes. But I do not understand the notations or conventions that the functions are being assigned to. The following are my questions:
The use of backticks `` to define the function name. Usually we wouldn't use backticks or even double quotes to assign variables/functions, but I see this happening a lot. Is this a naming convention?
Why is the . included before the blob name? Wouldn't it usually just be called blob, and a method would be method.blob?
Why are there [ brackets there? Especially [<- and [[<-. Are we performing some sort of double assignment?
Hopefully someone will be able to shed some light on what is happening.
#' @export
`[.blob` <- function(x, i, ...) {
new_blob(NextMethod())
}
#' @export
`[<-.blob` <- function(x, i, ..., value) {
if (!is_raw_list(value)) {
stop("RHS must be list of raw vectors", call. = FALSE)
}
NextMethod()
}
#' @export
`[[<-.blob` <- function(x, i, ..., value) {
if (!is.raw(value) && !is.null(value)) {
stop("RHS must be raw vector or NULL", call. = FALSE)
}
if (is.null(value)) {
x[i] <- list(NULL)
x
} else {
NextMethod()
}
}

Summary
If you're creating a new object in R for which you want 'different' subset and assignment behaviour, you should create the associated methods for these operations.
The . IS working in the way you're expecting - method dispatch
[.blob is overriding the S3 [ subset operator
[<-.blob is overriding the S3 [<- operator (i.e. vector-subset assignment)
[[<-.blob is overriding the S3 [[<- operator (i.e. list-subset assignment)
Special symbols (e.g., backticks, brackets, percent signs, variable names containing spaces) cannot be assigned to directly. Surrounding the name in backticks makes the assignment work. As an example, a variable named A B cannot be assigned with A B <- 1, whereas `A B` <- 1 works (credit @r2evans)
Examples
subset
Taking [.blob as an example, this allows you to create your own subset operation for your blob object.
## Create your own blob object (class)
blob <- 1:5
attr(blob, "class") <- "blob"
## create a subset operator, which in this example just calls the next method in the s3 dispatch chain
`[.blob` <- function(x, i, j, ...) NextMethod()
As we're not doing anything special in our own subset method, this works like normal R vectors
blob[3]
# [1] 3
However, we can make the subset operation do whatever we want, for example always return the 1st element of the vector
## define the function to always subset the first element
`[.blob` <- function(x, i, j, ...) { i = 1; NextMethod() }
Now your blob object will only ever return the 1st element.
blob[1]
# [1] 1
blob[2]
# [1] 1
blob[3]
# [1] 1
Assignment
Similarly for one of the assignment operators, if you overload [<- with
`[<-.blob` <- function(x, i, j, ...) { i = 5; NextMethod() }
This will always assign the 5th element of your blob object with the new value
blob[1] <- 100
blob
# [1] 1 2 3 4 100
# attr(,"class")
# [1] "blob"
Back ticks
The back-ticks are used so we can assign functions/variables to special symbols.
For example, try to assign a vector to the [ symbol
[ <- 1:5
# Error: unexpected '[' in "["
Whereas surrounding it with ticks lets it pass (although this example is not recommended)
`[` <- 1:5
`[`
# [1] 1 2 3 4 5
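Putting the pieces together, the pattern from the question (a `[` method that restores the class and a `[<-` method that validates the right-hand side) can be sketched on a toy class. The class name "temp", the constructor new_temp() and the error message are made up for illustration and are not part of the blob package.

```r
# Hypothetical constructor for a toy class "temp".
new_temp <- function(x) structure(x, class = "temp")

# Subsetting: let the default `[` do the work, then restore the class.
`[.temp` <- function(x, i, ...) {
  new_temp(NextMethod())
}

# Subset assignment: validate the right-hand side, then defer to the default.
`[<-.temp` <- function(x, i, ..., value) {
  if (!is.numeric(value)) {
    stop("RHS must be numeric", call. = FALSE)
  }
  NextMethod()
}

temps <- new_temp(c(10, 20, 30))
class(temps[2:3])   # the subset keeps the class
temps[1] <- 5       # accepted by the validation

# A non-numeric value is rejected; the message is captured instead of aborting.
res <- tryCatch(
  { temps[1] <- "a"; "no error" },
  error = function(e) conditionMessage(e)
)
res
```

The backticks are needed only because `[.temp` and `[<-.temp` are not syntactically valid names; the part after the dot is the class on which the method is dispatched.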

deparse(substitute()) returns function name normally, but function code when called inside for loop

I'm a bit surprised by R's behaviour in a very specific case. Let's say I define a function square that returns the square of its argument, like this:
square <- function(x) { return(x^2) }
I want to call this function within another function, and I also want to display its name when I do that. I can do that using deparse(substitute()). However, consider the following examples:
ds1 <- function(x) {
print(deparse(substitute(x)))
}
ds1(square)
# [1] "square"
This is the expected output, so all is fine. However, if I pass the function wrapped in a list and process it using a for loop, the following happens:
ds2 <- function(x) {
for (y in x) {
print(deparse(substitute(y)))
}
}
ds2(c(square))
# [1] "function (x) " "{" " return(x^2)" "}"
Can anybody explain to me why this occurs and how I could prevent it from happening?

As soon as you use x inside your function, it is evaluated, so it stops being an unevaluated expression and becomes its resulting value. The loop variable y then holds the function object itself, which is why deparse() prints its code. To prevent this, capture x with substitute() before you use it for the first time.
The result of substitute() is a call object, which you can index as if it were a list. So you can use
x <- substitute(x)
and then x[[1]] (the name of the called function, here c) and x[[2]] and following (the arguments of the call).
So this works:
ds2 <- function(x) {
x <- substitute(x)
# you can do `x[[1]]` but you can't use the call object x directly in a
# for loop. So you have to turn it into a list first
for (y in as.list(x)[-1]) {
print(deparse(y))
}
}
ds2(c(square,sum))
## [1] "square"
## [1] "sum"
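If the names are needed as a value rather than printed, the same idea can be written without a loop. This is a sketch under the same assumption as above, namely that the argument is written as a call such as c(square, sum); the function name arg_names is hypothetical.

```r
square <- function(x) { return(x^2) }

# Hypothetical helper: return the argument expressions of the call as strings.
arg_names <- function(x) {
  expr <- substitute(x)              # the unevaluated call, e.g. c(square, sum)
  vapply(as.list(expr)[-1],          # drop the first element (the function c)
         deparse, character(1))
}

arg_names(c(square, sum))
```

Note that this only inspects the expression as written; if a ready-made list is passed by variable name, substitute() sees only that variable name.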

Is it possible to modify an object on a list in a parent frame in R?

I'm working on an R package that has a number of functions that follow a non-R-standard practice of modifying in place the object passed in as an argument. This normally works OK, but fails when the object to be modified is on a list.
A function that illustrates the form of the assignments:
myFun<-function(x){
xn <- deparse(substitute(x))
ev <- parent.frame()
# would do real stuff here ..
# instead set simple value to modify local copy
x[[1]]<-"b"
# assign in parent frame
if (exists(xn, envir = ev))
on.exit(assign(xn, x, pos = ev))
# return invisibly
invisible(x)
}
This works:
> myObj <-list("a")
> myFun(myObj)
> myObj
[[1]]
[1] "b"
But it does not work if the object is a member of a list:
> myObj <-list("a")
> myList<-list(myObj,myObj)
> myFun(myList[[1]])
> myList
[[1]]
[[1]][[1]]
[1] "a"
[[2]]
[[2]][[1]]
[1] "a"
After reading answers to other questions here, I see the docs for assign clearly state:
assign does not dispatch assignment methods, so it cannot be used to set elements of vectors, names, attributes, etc.
Since there is an existing codebase using these functions, we cannot abandon the modify-in-place syntax. Does anyone have suggestions for workarounds or alternative approaches for modifying objects which are members of a list in a parent frame?
UPDATE:
I've considered trying to roll my own assignment function, something like:
assignToListInEnv<-function(name,env,value){
# assume name is something like "myList[[1]]"
#check for brackets
index<-regexpr('[[',name,fixed=TRUE)[1]
if(index>0){
lname<-substr(name,0,index-1)
#check that it exists
if (exists(lname,where=env)){
target<-get(lname,pos=env)
# make sure it is a list
if (is.list(target)){
eval(parse(text=paste('target',substr(name,index,999),'<-value',sep='')))
assign(lname, target, pos = env)
} else {
stop('object ',lname,' is not a list in environment ',env)
}
} else {
stop('unable to locate object ',lname,' in frame ',env)
}
}
}
But it seems horribly brittle, would need to handle many more cases ($ and [ as well as [[) and would probably still fail for [[x]] because x would be evaluated in the wrong frame...

Since this question was among the first search results for my query, here's my solution:
You can use paste() with "<<-" to create an expression which will assign the value to your list element when evaluated.
assignToListInEnv<-function(name, value, env = parent.frame()){
cl <- as.list(match.call())
lang <- str2lang(paste(cl["name"], "<<-", cl["value"]))
eval(lang, envir = env)
}
EDIT : revisiting this answer because it got a vote up
I'm not sure why I used <<- instead of <-. When the 'env' argument is used, <<- will assign in the parent (enclosing) environment of that env rather than in env itself.
So if you always want it to be the first parent.frame it can just be :
assignToListInParentFrame<-function(name, value){
cl <- as.list(match.call())
paste(cl["name"], "<<-", cl["value"]) |>
str2lang() |>
eval()
}
and if you want to specify the environment in which to modify the list:
assignToListInEnv<-function(name, value, env){
cl <- as.list(match.call())
paste(cl["name"], "<-", cl["value"]) |>
str2lang() |>
eval(envir = env)
}
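An alternative that avoids pasting and re-parsing text is to build the assignment call from the captured argument expression and evaluate it in the caller's frame. The following is only a sketch under stated assumptions: the function name modify_in_place and its value argument are hypothetical, and the "real work" is replaced by setting the first element, as in the question's myFun.

```r
# Hypothetical function: modifies the object named by its first argument
# in the caller's frame, including when it is an element of a list.
modify_in_place <- function(x, value) {
  target <- substitute(x)        # e.g. myObj or myList[[1]], unevaluated
  caller <- parent.frame()

  new_x <- x                     # local copy
  new_x[[1]] <- value            # stand-in for the real modification

  # Build `target <- quote(new_x value)` and run it where the caller lives.
  # Wrapping in quote() keeps the value from being evaluated a second time.
  eval(call("<-", target, call("quote", new_x)), envir = caller)
  invisible(new_x)
}

myObj <- list("a")
myList <- list(myObj, myObj)

modify_in_place(myObj, "b")        # plain variable
modify_in_place(myList[[1]], "c")  # element of a list

myObj[[1]]
myList[[1]][[1]]
myList[[2]][[1]]
```

Because the left-hand side is a real call object, R's normal replacement-function machinery handles $, [ and [[ alike, and any index expression inside it is evaluated in the caller's frame.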