This question already has answers here:
What are the differences between "=" and "<-" assignment operators?
(9 answers)
Closed 3 years ago.
Is it just a style preference?
As far as I can tell, they are the same.
I see many people prefer the "longer" <- version and I can't tell why (perhaps keeping away from = and == confusions?)
No, they are not exactly the same: the = operator cannot be used everywhere that <- can.
The operators <- and = assign into the environment in which they are evaluated. The operator <- can be used anywhere, whereas the operator = is only allowed at the top level (e.g., in the complete expression typed at the command prompt) or as one of the subexpressions in a braced list of expressions.
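A minimal sketch of that difference, using only base R. The helper `half()` and the variable name `value` are made up for illustration, and the parse check assumes that `parse(text = ...)` raises an error for code that is not syntactically valid.

```r
# Hypothetical helper, used only to illustrate argument passing.
half <- function(value) value / 2

# Inside a call, `=` names an argument and creates no variable in the caller.
r1 <- half(value = 10)
created_by_eq <- exists("value", inherits = FALSE)

# `<-` inside a call assigns in the calling environment, then passes the value.
r2 <- half(value <- 10)
created_by_arrow <- exists("value", inherits = FALSE)

# `=` is rejected by the parser in an `if` condition, while `<-` is accepted.
parses <- function(code) {
  !inherits(try(parse(text = code), silent = TRUE), "try-error")
}
eq_in_if <- parses("if (a = 1) 'yes'")
arrow_in_if <- parses("if (a <- 1) 'yes'")
```

Both calls return 5, but only the `<-` form leaves a variable behind in the caller, and only the `<-` form parses inside the `if` condition.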
There are also differences in scope. See this answer for more details.
Which is better depends on who you ask.
Reading from "Introducing Monte Carlo Methods with R", by Robert and Casella:
"The assignment operator is =, not to be confused with ==, which is the Boolean operator for equality. An older assignment operator is <- and, for compatibility reasons, it still remains functional, but it should be ignored to ensure cleaner programming."
(As pointed out by Spector, P. (2009). 'Data Manipulation with R' - Section 8.7., an exception is when using system.time, since = is then used to identify keywords)
On the other hand, Google's R style guide recommends using <-:
Assignment
Use <-, not =, for assignment.
GOOD:
x <- 5
BAD:
x = 5
[ and [[ come up a lot when using R. Suppose that I'm having a conversation about these two functions, what do I actually call these "indexing operators"? I know how to name them as punctuation, but is there anything within R or its documentation that gives them a more specific name? I know that they're subsetting functions that are documented under ?Extract, but I've never seen anyone call them anything like "extract and double extract".
In the R Language Definition they are referred to as "single and double brackets":
Indexing of arrays and vectors is performed using the single and double brackets, [] and [[]]
In An Introduction to R you find several instances of "square bracket" and "double square bracket".
This agrees with the general terminology (see e.g. Wikipedia on square brackets and double brackets, and their Unicode names).
Consider the following R code:
y1 <- dataset %>% dplyr::filter(W == 1)
This works, but there seems to be some magic here. Usually, when we have an expression like foo(bar), we should be able to do this:
baz <- bar
foo(baz)
However, in the presented code snippet, we cannot evaluate W == 1 outside of dplyr::filter()! W is not a defined variable.
What's going on?
dplyr uses a concept called Non-standard Evaluation (NSE) to make columns from the data frame argument accessible to its functions without quoting or using dataframe$column syntax. Basically:
[Non-standard evaluation] is a catch-all term that means they don’t follow the usual R rules of evaluation. Instead, they capture the expression that you typed and evaluate it in a custom way.1
In this case, the custom evaluation captures the argument(s) given to dplyr::filter and evaluates them so that W refers to dataset$W. You can't evaluate that expression elsewhere because the NSE lookup applies only inside the function call.
NSE makes a trade-off: functions that change how their arguments are evaluated are harder to use safely, and sometimes unusable, when you call them from other functions:
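The capture-then-evaluate idea can be sketched in base R. This is not dplyr's implementation: `nse_filter()` is a hypothetical helper, similar in spirit to base R's `subset()`, built on `substitute()` and `eval()`.

```r
# Hypothetical helper: a bare-bones imitation of the idea, not dplyr's code.
nse_filter <- function(data, condition) {
  expr <- substitute(condition)             # capture `W == 1` without evaluating it
  keep <- eval(expr, data, parent.frame())  # look up names in `data` first, then the caller
  data[!is.na(keep) & keep, , drop = FALSE]
}

dataset <- data.frame(W = c(1, 0, 1), V = c("a", "b", "c"))
y1 <- nse_filter(dataset, W == 1)  # works even though no variable `W` exists
```

Here `W` is found because the captured expression is evaluated with the data frame's columns in scope, which is why the same expression fails outside the call.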
This is an example of the general tension between functions that are designed for interactive use and functions that are safe to program with. A function that uses substitute() might reduce typing, but it can be difficult to call from another function.2
For example, if you wanted to write a function which would use the same code, but swap out W == 1 for W == 0 (or some completely different filter), NSE would make that more difficult to accomplish.
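A rough base R illustration of that difficulty, assuming no variable named `W` exists in the calling environment. Both `nse_filter()` and `filter_by()` are hypothetical names, not dplyr functions.

```r
# Hypothetical NSE helper in the style of base R's subset().
nse_filter <- function(data, condition) {
  expr <- substitute(condition)
  keep <- eval(expr, data, parent.frame())
  data[!is.na(keep) & keep, , drop = FALSE]
}

# A naive wrapper: substitute() now captures the symbol `cond`, not `W == 0`.
filter_by <- function(data, cond) nse_filter(data, cond)

dataset <- data.frame(W = c(1, 0, 1))

# Calling the NSE function directly works.
direct <- nse_filter(dataset, W == 0)

# Going through the wrapper fails, because `W == 0` ends up being evaluated
# in the caller's environment, where `W` is not defined.
wrapped <- tryCatch(filter_by(dataset, W == 0),
                    error = function(e) conditionMessage(e))
```

After running this, `direct` holds the single matching row, while `wrapped` holds an error message instead of a data frame.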
In 2017 the tidyverse started to build a solution to this in tidy evaluation.
My question is where does the piping operator of magrittr package %>% come in the order of operations?
I have a problem similar to the following:
set.seed(10)
df <- data.frame(a=rnorm(3),b=rnorm(3),c=rnorm(3))
df/rowSums(df) %>% round(.,3)
This results in the following non rounded figures:
a b c
1 -0.0121966 0.119878 0.8922125
To get the rounded figures I need to put df/rowSums(df) between brackets.
I experimented with +, -, *, / and ^, and from the results I found the order of operations is as follows:
Exponents
Piping
Multiplication and division
Addition and subtraction
Is that right, or is there something wrong with my understanding of the piping operator?
The help page you are looking for is ?Syntax. (Don't feel bad for not being able to find this, it took me about six guesses at search keywords.) I'm going to quote its entire operator precedence table here:
The following unary and binary operators are defined. They are
listed in precedence groups, from highest to lowest.

‘:: :::’             access variables in a namespace
‘$ @’                component / slot extraction
‘[ [[’               indexing
‘^’                  exponentiation (right to left)
‘- +’                unary minus and plus
‘:’                  sequence operator
‘%any%’              special operators (including ‘%%’ and ‘%/%’)
‘* /’                multiply, divide
‘+ -’                (binary) add, subtract
‘< > <= >= == !=’    ordering and comparison
‘!’                  negation
‘& &&’               and
‘| ||’               or
‘~’                  as in formulae
‘-> ->>’             rightwards assignment
‘<- <<-’             assignment (right to left)
‘=’                  assignment (right to left)
‘?’                  help (unary and binary)
So magrittr's pipe operators, like all the operators of the form %whatever%, do indeed have precedence greater than multiply and divide but lower than exponentiation, and this is guaranteed by the language specification.
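One way to check this without loading magrittr is to inspect how R parses an expression. In this small sketch `%op%` is a made-up operator standing in for `%>%`; `quote()` only parses the code, so the operator never needs to be defined.

```r
# `%op%` is hypothetical; any operator of the form %whatever% parses the same way.
e1 <- quote(a / b %op% c)
top1 <- as.character(e1[[1]])  # "/": parsed as a / (b %op% c)

e2 <- quote(a ^ b %op% c)
top2 <- as.character(e2[[1]])  # "%op%": parsed as (a ^ b) %op% c
```

The outermost call in `e1` is the division, which matches the behavior in the question: only `rowSums(df)` is piped into `round()`, and the division happens afterwards.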
Personally, I don't see the value in these operators. Why not just write
round(df/rowSums(df), 3)
which has the evaluation order you want, and is (IMNSHO) easier to read as well?
I'm using R 2.8.1 and it is possible to use both = and <- as variable assignment operators. What's the difference between them? Which one should I use?
From here:
The operators <- and = assign into the environment in which they are evaluated. The operator <- can be used anywhere, whereas the operator = is only allowed at the top level (e.g., in the complete expression typed at the command prompt) or as one of the subexpressions in a braced list of expressions.
Reading from "Introducing Monte Carlo Methods with R", by Robert and Casella:
"The assignment operator is =, not to be confused with ==, which is the Boolean operator for equality. An older assignment operator is <- and, for compatibility reasons, it still remains functional, but it should be ignored to ensure cleaner programming.
(As pointed out by Spector, P. (2009). 'Data Manipulation with R' - Section 8.7., an exception is when using system.time, since = is then used to identify keywords)
A misleading feature of the assignment operator <- is found in Boolean
expressions such as
> if (x[1]<-2) ...
which is supposed to test whether or not x[1] is less than -2 but ends
up allocating 2 to x[1], erasing its current value! Note also that using
> if (x[1]=-2) ...
mistakenly instead of (x[1]==-2) has the same consequence."
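The first pitfall in the quoted passage comes down to whitespace, and can be reproduced in a few lines of base R. The values below are arbitrary and chosen only for illustration.

```r
x <- c(-5, 3)

# With spaces, this is a comparison: is x[1] less than -2?
is_less <- x[1] < -2  # TRUE, and x is unchanged

# Without spaces, the parser reads `<-` as assignment.
x[1]<-2               # x[1] is now 2
```

Spacing the operators consistently (`x[1] < -2` versus `x[1] <- 2`) avoids the ambiguity.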