Create a function with a special character in R

I would like to create a function with the percent sign in R, something similar to the pipe operator in magrittr (%>%).
Here is the code:
%|%(x) <- function(x){...}
Unfortunately I received the following error:
Error: unexpected SPECIAL in "%|%"
Is there anything I am missing?
Thank you for your help

Syntactically invalid names need to be wrapped in backticks (`…`) to be used in code. This includes operators when using them as regular R names rather than infix operators. This is the case when you want to define them:
`%|%` <- function(a, b) a + b
It’s also the case when you want to pass them into a higher-order function such as sapply:
sapply(1 : 5, `-`)
# [1] -1 -2 -3 -4 -5
(Of course this particular example is pretty useless since most operators are vectorised so you could just write - (1 : 5) instead of the above.)
You might also see code that uses quotes instead of backticks but this is discouraged.
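As a small illustration of the points above, the sketch below defines an operator and calls it in infix form, in prefix form, and through a higher-order function. The operator name %+% and its behaviour (string concatenation) are made up for this example.

```r
# Hypothetical operator: concatenate two strings (name chosen for illustration)
`%+%` <- function(a, b) paste0(a, b)

# Infix use: no backticks needed
infix_result <- "foo" %+% "bar"

# Prefix use: backticks turn the operator back into a regular name
prefix_result <- `%+%`("foo", "bar")

# Passing the operator to a higher-order function also needs backticks
reduced <- Reduce(`%+%`, c("a", "b", "c"))

print(infix_result)   # "foobar"
print(prefix_result)  # "foobar"
print(reduced)        # "abc"
```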

Related

Can you create an R function that calls using a prefix and suffix (operating like brackets)?

I have read about prefix functions and infix functions on Hadley Wickham's Advanced R website. I would like to know if there is any way to define functions that are called by placing a prefix and suffix around a single argument, so that the prefix and suffix operate like brackets. Is there any way to create a function like this, and if so, how do you do it?
An example for formulation: In order to give a specific example for formulation, suppose you have an object char that is a character string. You want to create a function that is called on a character string using the prefix _# and suffix #_ and the function adds five dashes to the front of the character string. If programmed successfully, it would operate as shown below.
char
[1] "Hello"
_#char#_
[1] "-----Hello"
There is a way to do this as long as your special operator takes a particular form, namely .%_% char %_%. This works because the parser interprets the dot as a variable name. If we use non-standard evaluation, the dot doesn't need to actually exist; it only serves as a marker for opening and closing our special operator. So we can do something like this:
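`%_%` <- function(a, b)
{
  # Exactly one of the two operands must be the literal marker `.`
  if((deparse(match.call()$a) != ".") +
     (deparse(match.call()$b) != ".") != 1)
    stop("Unrecognised SPECIAL")
  # Opening marker: tag the value so the closing call can recognise it
  if(deparse(match.call()$a) == ".")
    return(`attr<-`(b, "prepped", TRUE))
  # Closing marker: the left operand must have been tagged by the opening call
  if(isTRUE(attr(a, "prepped")))
    return(paste0("-----", a))
  stop("Unrecognised SPECIAL")
}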
.%_% "hello" %_%.
#> [1] "-----hello"
However, this is a weird thing to do in R. It's not idiomatic and uses more keystrokes than a simple function call would. It would also very likely cause unpredictable problems in places where non-standard evaluation is used. This is really just a demo to show that it can be done, not that it should be done.
Writing a simple function seems like a more R-like solution. If terseness is a priority, then maybe something like
._ <- function(x) paste0("-----", x)
._("hello")
# [1] "-----hello"
Or if you wanted something more bracket-like
.. <- structure(list(NULL), class="dasher")
`[.dasher` <- function(a, x) paste0("-----", x)
..["hello"]
# [1] "-----hello"
Another way to use a custom class would be to redefine the - operator to paste that value in front of the string. For example
literal <- function(x) {class(x)<-"literal"; x}
`-.literal` <- function(e1, e2) {literal(paste0("-", unclass(e1)))}
print.literal <- function(x) print(unclass(x))
Then you can do
val <- literal("hello")
-----val
# [1] "-----hello"
---val
# [1] "---hello"
So here the number of - you type is the number you get in the output.
You can get creative/weird with syntax, but you need to make sure whatever symbols you come up with can be parsed by the parser otherwise you are out-of-luck.
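One way to find out in advance whether an invented syntax gets past the parser is to hand it to parse() as text. The helper name can_parse below is hypothetical; the sketch only checks syntax and evaluates nothing.

```r
# Hypothetical helper: TRUE if `code` is syntactically valid R, FALSE otherwise.
# parse() only checks syntax; the code is never evaluated.
can_parse <- function(code) {
  tryCatch({
    parse(text = code)
    TRUE
  }, error = function(e) FALSE)
}

dot_markers  <- can_parse(".%_% x %_%.")              # TRUE: dots parse as names
unary_minus  <- can_parse("-----val")                 # TRUE: repeated unary minus
bare_special <- can_parse("%|%(x) <- function(x) x")  # FALSE: unexpected SPECIAL
```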

Why can't I combine Reduce with paste when using "*" as a character?

I'm trying to get the output "1*2*4*5" from (function(x) Reduce(paste0(toString("*")),x))(c(1,2,4,5)), but no matter how I manipulate Reduce, paste0, and the asterisks, I'm either getting error messages or the asterisks being treated as multiplication (giving 40). Where am I going wrong?
Reduce uses a function with two arguments to which it applies the previous result and the next element of the vector. Therefore, you need a function of both x and y:
Reduce(function(x,y)paste0(x,"*",y),c(1,2,4,5))
#[1] "1*2*4*5"
As an aside, you can provide an initial value to be applied as x for the first element of the vector with init =.
Reduce(function(x,y)paste0(x,"*",y),c(1,2,4,5), init = 0)
#[1] "0*1*2*4*5"
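To watch the pairwise folding happen step by step, Reduce can be asked to keep its intermediate results with accumulate = TRUE. The names join_star and steps are only for illustration.

```r
# Two-argument function: previous result on the left, next element on the right
join_star <- function(x, y) paste0(x, "*", y)

# Keep every intermediate result instead of only the last one
steps <- Reduce(join_star, c(1, 2, 4, 5), accumulate = TRUE)

# The last intermediate result is what Reduce returns without accumulate
final <- steps[[length(steps)]]
print(steps)
print(final)  # "1*2*4*5"
```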
One thing you may have tried was this:
Reduce(paste0("*"),c(1,2,4,5))
#[1] 40
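This applies the multiplication operator to x and y: paste0("*") evaluates to the string "*", and Reduce looks up the function with that name (via match.fun), which is the multiplication operator.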
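The lookup by name can be reproduced directly with match.fun, which is a short way to see why a string such as "*" ends up being treated as multiplication. The variable names are only for illustration.

```r
# match.fun() turns a function name given as a string into the function itself
times <- match.fun("*")
product_pair <- times(2, 3)

# Passing the string "*" to Reduce therefore multiplies the elements
product_all <- Reduce("*", c(1, 2, 4, 5))

print(product_pair)  # 6
print(product_all)   # 40
```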
Another base R option is to use paste within gsub, e.g.,
x <- 1:5
gsub("\\s","*",Reduce(paste,x))
which gives
[1] "1*2*3*4*5"
KISS method (with improvements as suggested by @nicola):
bar <- as.character(1:5)
paste0(bar, collapse = '*')
#[1] "1*2*3*4*5"

Use of $ and %% operators in R

I have been working with R for about 2 months and have had a little bit of trouble getting a handle on how the $ and %% operators work.
I understand I can use the $ operator to pull a certain value from a function's result (e.g. t.test(x)$p.value), but I'm not sure if this is a universal definition. I also know it can be used to pull out specific pieces of data.
I'm also curious about the use of the %% operator, in particular when a value is placed between the percent signs (e.g. %x%). I am aware of its use as a modulo or remainder operator, e.g. 7 %% 5 returns 2. Perhaps I am being ignorant and this is not real?
Any help or links to literature would be greatly appreciated.
Note: I have been searching for this for a couple hours so excuse me if I couldn't find it!
You are not really pulling a value from a function but rather from the list object that the function returns. $ is actually an infix operator that takes two arguments, the values preceding and following it. It is a convenience function that uses non-standard evaluation of its second argument. It's called non-standard because the unquoted characters following $ are first quoted before being used to extract a named element from the first argument.
t.test # is the function
t.test(x) # is a named list with one of the names being "p.value"
The value can be pulled in one of three ways:
t.test(x)$p.value
t.test(x)[['p.value']] # numeric vector
t.test(x)['p.value'] # a list with one item
my.name.for.p.val <- 'p.value'
t.test(x)[[ my.name.for.p.val ]]
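The practical consequence of that non-standard evaluation shows up when the element name is stored in a variable. In the sketch below, a plain list with made-up values stands in for the result of t.test(x).

```r
# Made-up values standing in for a test result
res <- list(statistic = 2.5, p.value = 0.03)
nm <- "p.value"

by_dollar  <- res$p.value       # the name after $ is taken literally
by_bracket <- res[[nm]]         # nm is evaluated, so this looks up "p.value"
by_name    <- res$nm            # looks for an element called "nm": NULL
prefix     <- `$`(res, p.value) # $ is a function like any other

print(by_dollar)   # 0.03
print(by_bracket)  # 0.03
print(by_name)     # NULL
print(prefix)      # 0.03
```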
When you surround a set of characters with flanking "%"-signs you can create your own vectorized infix function. If you wanted a pmax for which the default was na.rm=TRUE, do this:
'%mypmax%' <- function(x,y) pmax(x,y, na.rm=TRUE)
And then use it without quotes:
> c(1:10, NA) %mypmax% c(NA,10:1)
[1] 1 10 9 8 7 6 7 8 9 10 1
First, the $ operator is for selecting an element of a list. See help('$').
The %% operator is the modulo operator. See help('%%').
The '$' operator is used to select a particular element from a list or any other data structure that has named components.
For example, suppose data is a list which contains a matrix named MATRIX, among other things.
To get the matrix we write:
print(data$MATRIX)
The %% operator is the modulus operator, which gives the remainder.
For example, print(7 %% 3)
will print 1 as output.
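A few more cases may help pin down what %% does, including its companion %/% (integer division) and its behaviour on vectors and negative numbers. The variable names are only for illustration.

```r
remainder <- 7 %% 3        # 1
quotient  <- 7 %/% 3       # 2, integer division
negative  <- (-7) %% 3     # 2, the result takes the sign of the divisor
parity    <- (1:6) %% 2    # 1 0 1 0 1 0, %% is vectorised

# The two operators fit together: x == (x %/% y) * y + (x %% y)
rebuilt <- quotient * 3 + remainder

print(remainder)
print(quotient)
print(negative)
print(parity)
print(rebuilt)  # 7
```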

Why is := allowed as an infix operator?

I have come across the popular data.table package and one thing in particular intrigued me. It has an in-place assignment operator
:=
This is not defined in base R. In fact, if you didn't load the data.table package, it would have raised an error if you had tried to use it (e.g., a := 2) with the message:
Error: could not find function ":="
Also, why does := work? Why does R let you define := as an infix operator while every other infix function has to be surrounded by %%, e.g.
`:=` <- function(a, b) {
paste(a,b)
}
"abc" := "def"
Clearly it's not meant to be an alternative syntax to %function.name% for defining infix functions. Is data.table exploiting some parsing quirks of R? Is it a hack? Will it be "patched" in the future?
It is something that the base R parser recognizes and seems to parse as a left assign (at least in terms of order of operations and such). See the C source code for more details.
as.list(parse(text="a:=3")[[1]])
# [[1]]
# `:=`
#
# [[2]]
# a
#
# [[3]]
# [1] 3
As far as I can tell it's undocumented (as far as base R is concerned). But it is a function/operator you can change the behavior of
`:=`<-function(a,b) {a+b}
3 := 7
# [1] 10
As you can see there really isn't anything special about the ":" part itself. It just happens to be the start of a compound token.
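Because := is parsed like any other binary operator, a function bound to that name receives both operands as ordinary arguments and can inspect the unevaluated left-hand side. The sketch below only illustrates that idea; it is not how data.table implements :=, and the name total is made up.

```r
# Illustrative definition: capture the left-hand side as a name,
# evaluate the right-hand side, and return a one-element named list
`:=` <- function(lhs, rhs) {
  name <- deparse(substitute(lhs))
  structure(list(rhs), names = name)
}

# `total` does not need to exist: it is never evaluated
result <- (total := 1 + 2)
print(result)
```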
It's not just a colon operator but rather := is a single operator formed by the colon and equal sign (just as the combination of "<" and "-" forms the assignment operator in base R). The := operator is an infix function that is defined to be part of the evaluation of the "j" argument inside the [.data.table function. It creates or assigns a value to a column designated by its LHS argument using the result of evaluating its RHS.

R: What are operators like %in% called and how can I learn about them?

I know the basics like == and !=, or even the difference (vaguely) between & and &&. But stuff like %in% and %% and some stuff used in the context of sprintf(), like sprintf("%.2f", x) stuff I have no idea about.
Worst of all, they're hard to search for on the Internet because they're special characters and I don't know what they're called...
There are several different things going on here with the percent symbol:
Binary Operators
As several have already pointed out, things of the form %%, %in%, %*% are binary operators (respectively modulo, match, and matrix multiply), just like +, -, etc. They are functions that operate on two arguments and that R recognizes as being special due to their name structure (starts and ends with a %). This allows you to use them in the form:
Argument1 %fun_name% Argument2
instead of the more traditional:
fun_name(Argument1, Argument2)
Keep in mind that the following are equivalent:
10 %% 2 == `%%`(10, 2)
"hello" %in% c("hello", "world") == `%in%`("hello", c("hello", "world"))
10 + 2 == `+`(10, 2)
R just recognizes the standard operators as well as the %x% operators as special and allows you to use them as traditional binary operators if you don't quote them. If you quote them (in the examples above with backticks), you can use them as standard two argument functions.
Custom Binary Operators
The big difference between the standard binary operators and %x% operators is that you can define custom binary operators and R will recognize them as special and treat them as binary operators:
`%samp%` <- function(e1, e2) sample(e1, e2)
1:10 %samp% 2
# [1] 1 9
Here we defined a binary operator version of the sample function
"%" (Percent) as a token in special functions
The meaning of "%" in function like sprintf or format is completely different and has nothing to do with binary operators. The key thing to note is that in those functions the % character is part of a quoted string, and not a standard symbol on the command line (i.e. "%" and % are very different). In the context of sprintf, inside a string, "%" is a special character used to recognize that the subsequent characters have a special meaning and should not be interpreted as regular text. For example, in:
sprintf("I'm a number: %.2f", runif(3))
# [1] "I'm a number: 0.96" "I'm a number: 0.74" "I'm a number: 0.99"
"%.2f" means a floating point number (f) to be displayed with two decimals (.2). Notice how the "I'm a number: " piece is interpreted literally. The use of "%" allows sprintf users to mix literal text with special instructions on how to represent the other sprintf arguments.
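A few more format specifications, as a sketch of how the pieces after "%" combine. The values are made up, and "%%" is the way to get a literal percent sign.

```r
count_text  <- sprintf("%d items", 3L)               # integer
padded_text <- sprintf("%05d", 42L)                  # zero-padded to width 5
mixed_text  <- sprintf("%s scored %.2f", "Ann", pi)  # string plus two decimals
pct_text    <- sprintf("%5.1f%%", 12.345)            # width 5, one decimal, literal %

print(count_text)   # "3 items"
print(padded_text)  # "00042"
print(mixed_text)   # "Ann scored 3.14"
print(pct_text)     # " 12.3%"
```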
The R Language Definition, section 3.1.4 refers to them as "special binary operators". One of the ways they're special is that users can define new binary operators using the %x% syntax (where x can be any sequence of printable characters other than %).
The Writing your own functions section of An Introduction to R, refers to them as Binary Operators (which is somewhat confusing because + is also a binary operator):
10.2 Defining new binary operators
Had we given the bslash() function a different name, namely one of the form
%anything%
it could have been used as a binary operator in expressions rather than in function form. Suppose, for example, we choose ! for the internal character. The function definition would then start as
> "%!%" <- function(X, y) { ... }
(Note the use of quote marks.) The function could then be used as X %!% y. (The backslash symbol itself is not a convenient choice as it presents special problems in this context.)
The matrix multiplication operator, %*%, and the outer product matrix operator %o% are other examples of binary operators defined in this way.
They don’t have a special name as far as I know. They are described in R operator syntax and precedence.
The %anything% operators are just normal functions, which can be defined by yourself. You do need to put the name of the operator in backticks (`…`), though: this is how R treats special names.
`%test%` = function (a, b) a * b
2 %test% 4
# [1] 8
The sprintf format strings are entirely unrelated, they are not operators at all. Instead, they are just the conventional C-style format strings.
The help file, and the general entry, is indeed a good starting point: ?'%in%'
For example, you can see how the operator '%in%' is defined:
"%in%" <- function(x, table) match(x, table, nomatch = 0) > 0
You can even create your own operators:
'%ni%' <- Negate('%in%')
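Used in practice, the negated operator reads as "not in". A short sketch with made-up values:

```r
# Negate() returns a function that gives the logical opposite of %in%
`%ni%` <- Negate(`%in%`)

single <- 3 %ni% c(1, 2)                  # TRUE: 3 is not in the set
several <- c("a", "z") %ni% letters[1:3]  # FALSE TRUE: vectorised like %in%

print(single)
print(several)
```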
