I am trying to extract values from a vector using numeric vectors expressed in two seemingly equivalent ways:
x <- c(1,2,3)
x[2:3]
# [1] 2 3
x[1+1:3]
# [1] 2 3 NA
I am confused why the expression x[2:3] produces a result different from x[1+1:3] -- the second includes an NA value at the end. What am I missing?
Because the operator : has higher precedence than binary +, 1+1:3 is really 1+(1:3) (i.e. 2:4) and not 2:3. To change the order of evaluation defined by operator precedence, use parentheses: x[(1+1):3].
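A minimal sketch of the two groupings, reusing x from the question. The names idx_seq_first and idx_sum_first are made up for illustration.

```r
x <- c(1, 2, 3)

# `:` binds tighter than binary `+`, so this is 1 + (1:3), i.e. 2 3 4
idx_seq_first <- 1 + 1:3

# Parentheses force the addition to happen first, giving 2:3
idx_sum_first <- (1 + 1):3

x[idx_seq_first]  # index 4 is out of range, so the last value is NA
x[idx_sum_first]  # same as x[2:3]
```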
You can see the order of precedence of operators in the help file ?Syntax. Here is the relevant part:
The following unary and binary operators are defined. They are listed in precedence groups, from highest to lowest.
:: ::: access variables in a namespace
$ @ component / slot extraction
[ [[ indexing
^ exponentiation (right to left)
- + unary minus and plus
: sequence operator
%any% special operators (including %% and %/%)
* / multiply, divide
+ - (binary) add, subtract
Can you help me understand how R interprets square brackets with forms such as y[i:j - k]?
dummy data:
y <- c(1, 2, 3, 5, 7, 8)
Here's what I do understand:
y[i] is the ith element of vector y.
y[i:j] is the ith to jth element (inclusive) of vector y.
y[-i] is vector y without the ith element, etc.
However, what I don't understand is what happens when you start mixing these options, and I haven't found a good resource for explaining it.
For example:
y[1-1:4]
[1] 5 7 8
So y[1-1:4] returns the vector without the first three elements. But why?
and
y[1-4]
[1] 1 2 5 7 8
So y[1-4] returns the vector without the third element. Is that because 1-4 = -3 and it's interpreting it the same as y[-3]? If so, that doesn't seem consistent with my previous example where y[1-1:4] would presumably be interpreted as y[0:4], but that isn't the case.
and
y[1:1+2-1]
[1] 2
Why does this return the second element? I encountered this while I was trying to code something along the lines of y[i:i + j - k], and it took me a while to figure out that I should write y[i:(i + j - k)] so that the parentheses capture the whole right-hand side of the colon. But I still can't figure out what logic R was following when I didn't have those parentheses.
Thanks!
It's best to look closer at precedence and the integer sequences you use for subsetting. These are evaluated before subsetting with []. Note that - is a function with two arguments (1, 1:4) which are evaluated beforehand and so
> 1-1:4
[1] 0 -1 -2 -3
Negative indices in [] mean exclusion of the corresponding elements. There is no "0" element, so a 0 index selects nothing: on its own it returns an empty vector of the same type (numeric(0)), and alongside negative indices it is simply ignored. We thus expect y[1-1:4] to drop the first three elements of y and return the remainder.
As you write correctly y[1-4] is y[-3], i.e. omission of the third element.
Similarly, in 1:1+2-1 the : is evaluated first: 1:1 evaluates to the one-element vector 1, and the rest is simple arithmetic (1+2-1 = 2), so the result is y[2].
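The three cases can be checked step by step. This is only a sketch: idx is a made-up name, and the values of i, j and k are hypothetical, chosen just to show the two groupings.

```r
y <- c(1, 2, 3, 5, 7, 8)

# `:` is evaluated before binary `-`, so 1 - 1:4 is 1 - c(1, 2, 3, 4)
idx <- 1 - 1:4
idx                 # 0 -1 -2 -3

# The 0 is ignored; the negative indices drop elements 1, 2 and 3
y[idx]

# (1:1) + 2 - 1 reduces to the single index 2
y[1:1 + 2 - 1]

# Hypothetical values for i, j and k, chosen only for this sketch
i <- 2; j <- 3; k <- 1
y[i:i + j - k]      # (i:i) + j - k  -> index 4 only
y[i:(i + j - k)]    # i:(i + j - k)  -> indices 2 through 4
```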
For more on operator precedence, see Hadley's excellent book.
I typed the following in Julia's REPL:
julia> 6÷2(1+2)
1
julia> 6÷2*(1+2)
9
Why are the different results output?
Presh Talwalkar says 9 is correct in this video:
6÷2(1+2) = ? Mathematician Explains The Correct Answer - YouTube
YouTube notwithstanding, there is no correct answer. Which answer you get depends on what precedence convention you use to interpret the problem. Many of these viral "riddles" that go around periodically are contentious precisely because they are intentionally ambiguous. It's not really a math puzzle, just a parsing problem, no deeper than someone saying a sentence with two interpretations. What do you do in that case in real life? You just ask which one they meant. This is no different. For this very reason, the ÷ symbol isn't often used in real mathematical notation; fraction notation is used instead, which clearly disambiguates this as either:
6
- (1 + 2) = 9
2
or as
    6
--------- = 1
2 (1 + 2)
Regarding Julia specifically, this precedence behavior is documented here:
https://docs.julialang.org/en/v1/manual/integers-and-floating-point-numbers/#man-numeric-literal-coefficients
Specifically:
The precedence of numeric literal coefficients is slightly lower than that of unary operators such as negation. So -2x is parsed as (-2) * x and √2x is parsed as (√2) * x. However, numeric literal coefficients parse similarly to unary operators when combined with exponentiation. For example 2^3x is parsed as 2^(3x), and 2x^3 is parsed as 2*(x^3).
and the note:
The precedence of numeric literal coefficients used for implicit multiplication is higher than other binary operators such as multiplication (*), and division (/, \, and //). This means, for example, that 1 / 2im equals -0.5im and 6 // 2(2 + 1) equals 1 // 1.
I have three expressions, each involving multiplication with a logical or its negation. These logicals and their negation represent indicator variables, so that the expressions are conditionally evaluated:
-2*3*!T + 5*7*T
5*7*T + -2*3*!T
(-2*3*!T) + 5*7*T
I expect the above to produce the same result. However:
> -2*3*!T + 5*7*T
[1] 0 # unexpected!
> 5*7*T + -2*3*!T
[1] 35
> (-2*3*!T) + 5*7*T
[1] 35
I am sure this has something to do with operator precedence and type coercion, but I can't work out how it makes sense to even evaluate !T after the *.
You're exactly right that this is about operator precedence. As ?base::Syntax (which you link above) states, ! has lower precedence than all of the arithmetic operators, so the first expression is equivalent to
(-2*3)*!(T + 5*7*T)
(because the expression containing ! has to be evaluated before the final multiplication can be done) or
-6*!(36) # T coerced to 1 in numeric operations
or
-6*FALSE # non-zero numbers coerced to TRUE in logical operations
or
-6*0 # FALSE coerced to 0 in numeric operations
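The grouping can also be read off the parsed expression. The sketch below uses TRUE in place of T (T is an ordinary variable that can be reassigned), and the variable names are made up for illustration.

```r
# Capture the first expression unevaluated so its parse tree can be inspected
e <- quote(-2 * 3 * !TRUE + 5 * 7 * TRUE)

# The top-level call is `*`, and its right operand is the whole `!` expression
top_op   <- as.character(e[[1]])
right_op <- as.character(e[[3]][[1]])

# Evaluate the pieces in the order the parser groups them
inner   <- TRUE + 5 * 7 * TRUE   # 36
negated <- !inner                # FALSE, since 36 is non-zero
result  <- -2 * 3 * negated      # -6 * 0

# Parenthesising the negation gives the intended indicator behaviour
intended <- -2 * 3 * (!TRUE) + 5 * 7 * TRUE
```

Here result and eval(e) both come out as 0, while intended is 35, matching the outputs in the question.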
I just noticed a strange and interesting bug:
as.numeric((Sys.Date()-30)-Sys.Date())
#[1] -30
Which is correct. But:
library(dplyr)
(Sys.Date()-30)-Sys.Date() %>% as.numeric()
#[1] "1969-12-02"
If the %>% simply feeds the output into the first argument slot, surely this behavior isn't correct?
I've modified your code to make it reproducible for the future:
date <- as.Date("2016-10-18")
as.numeric((date-30)-date)
#[1] -30
(date-30)-date %>% as.numeric()
#[1] "1969-12-02"
You may also have noticed that adding parentheses changes these results:
(date-30)-(date %>% as.numeric())
#[1] "1969-12-02"
((date-30)-date) %>% as.numeric()
#[1] -30
The answer is in order of operations as specified on the Syntax help page. It states that:
The following unary and binary operators are defined. They are listed
in precedence groups, from highest to lowest.
:: ::: access variables in a namespace
$ @ component / slot extraction
[ [[ indexing
^ exponentiation (right to left)
- + unary minus and plus
: sequence operator
%any% special operators (including %% and %/%)
* / multiply, divide
+ - (binary) add, subtract
Note here that %any% comes before + - (binary), so date %>% as.numeric() is evaluated first and only then is the result subtracted from date-30. For the difference between unary and binary operators, I recommend the answer to this question.
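The same grouping can be reproduced without loading any package. In this sketch %then% is a hypothetical stand-in for %>% (it takes a bare function rather than a call), and the result names are made up; like every %any% operator it binds tighter than binary -.

```r
# Hypothetical stand-in for magrittr's %>%, defined only so the sketch needs no packages
`%then%` <- function(lhs, f) f(lhs)

date <- as.Date("2016-10-18")

# %any% binds tighter than binary `-`, so this is (date - 30) - (date %then% as.numeric)
piped_last_term <- (date - 30) - date %then% as.numeric

# Wrapping the whole difference sends the difftime through as.numeric instead
piped_difference <- ((date - 30) - date) %then% as.numeric

class(piped_last_term)     # "Date"
format(piped_last_term)    # "1969-12-02"
piped_difference           # -30
```

In the first form a plain number is subtracted from a Date, which is why the result is still a Date.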
Suppose I have two custom infix operators in R: %foo% and %bar%.
I have expressions that use both operators, such as:
x %foo% y %bar% z
How can I determine the operator precedence of %foo% and %bar%?
How can I change the precedence so that, for example, %bar% always executes before %foo%? In the example above this would be the same as:
x %foo% (y %bar% z)
I don't think this is explicitly documented, but implicit in the R language documentation is that user-defined infix operators (%any%) all have equal precedence and so are executed from left to right. This can be demonstrated as follows:
`%foo%` <- `+`
`%bar%` <- `*`
1 %bar% 2 %foo% 3
#5
1 %foo% 2 %bar% 3
#9
The only option I can think of would be to redefine one of the existing operators to do what you wanted. However, that itself would have repercussions so you might want to limit it to within a function.
It's also worth noting that using substitute does not change the operator precedence already set when the expression is first written:
eval(substitute(2 + 2 * 3, list(`+` = `*`, `*` = `+`)))
#10
2 * 2 + 3
#7
How can I determine the operator precedence of %foo% and %bar%?
You can't. R doesn't allow you to set the precedence of custom infix operators. User-defined infix operators have the default precedence rules which means they will be evaluated from left to right.
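A small sketch of the left-to-right rule, using throwaway definitions of %foo% and %bar% that only record how their operands were grouped (these definitions are assumptions for illustration, not the asker's operators):

```r
# Hypothetical operators that just record how their operands were grouped
`%foo%` <- function(a, b) paste0("foo(", a, ", ", b, ")")
`%bar%` <- function(a, b) paste0("bar(", a, ", ", b, ")")

# Left to right: %foo% is applied first, then %bar% wraps its result
default_grouping <- "x" %foo% "y" %bar% "z"

# Parentheses are the way to make %bar% run first
forced_grouping <- "x" %foo% ("y" %bar% "z")

default_grouping   # "bar(foo(x, y), z)"
forced_grouping    # "foo(x, bar(y, z))"
```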
One reason for this limitation is that it would be extremely difficult and limiting to implement and maintain a set of precedence rules for infix operators. Imagine that you loaded an R package which comes with some custom infix operators. Then the relationship of the infix operators from the package to the %foo% and %bar% which you created would need to be defined. This would quickly become a serious burden.
As an example, imagine that package one contains infix operator %P1IF% and package two contains infix operator %P2IF%. Each package has defined that its infix operator should have the highest precedence. If you were to load both package one and two, then the following expression would be undefined:
v1 %P1IF% v2 %P2IF% v3
(v1 %P1IF% v2) %P2IF% v3 # package 2 doesn't expect this
v1 %P1IF% (v2 %P2IF% v3) # package 1 doesn't expect this
Regardless of what the precedence might be, the result for one of the two packages might be incorrect.