Is there a technical difference between "=" and "<-" [duplicate] - r

This question already has answers here:
What are the differences between "=" and "<-" assignment operators?
(9 answers)
Closed 3 years ago.
I was wondering if there is a technical difference between the assignment operators "=" and "<-" in R. So, does it make any difference if I use:
Example 1: a = 1 or a <- 1
Example 2: a = c(1:20) or a <- c(1:20)
Thanks for your help
Sven

Yes there is. This is what the help page of '=' says:
The operators <- and = assign into the environment in which they are evaluated. The operator <- can be used anywhere, whereas the operator = is only allowed at the top level (e.g., in the complete expression typed at the command prompt) or as one of the subexpressions in a braced list of expressions.
By "can be used" the help file means "can be used to assign an object". Inside a function call you can't assign an object with =, because there = binds a value to a named argument instead.
Basically, if you use <- then you assign a variable that you will be able to use in your current environment. For example, consider:
matrix(1,nrow=2)
This just makes a 2 row matrix. Now consider:
matrix(1,nrow<-2)
This also gives you a two-row matrix, but now we also have an object called nrow which evaluates to 2! What happened is that in the second call we didn't set the argument nrow to 2; we assigned 2 to an object called nrow and sent that value to the second argument of matrix, which happens to be nrow.
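The side effect described above can be checked directly. This is a minimal sketch; the helper names created_by_equals and created_by_arrow are made up for illustration, and it assumes no variable called nrow exists yet in the environment where it runs.

```r
# Named argument: `nrow = 2` only binds the parameter inside matrix().
m1 <- matrix(1, nrow = 2)
created_by_equals <- exists("nrow", envir = environment(), inherits = FALSE)

# Assignment: `nrow <- 2` is evaluated in the calling environment,
# and its value (2) is passed positionally as the second argument.
m2 <- matrix(1, nrow <- 2)
created_by_arrow <- exists("nrow", envir = environment(), inherits = FALSE)

created_by_equals   # FALSE when no variable `nrow` existed beforehand
created_by_arrow    # TRUE: the call left a variable `nrow` behind
identical(m1, m2)   # TRUE: both are 2 x 1 matrices of ones
```

The inherits = FALSE argument matters here, because base R already has a function called nrow that a plain exists("nrow") would find.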
Edit:
As for the edited question: both are the same. The choice between = and <- can cause a lot of discussion as to which one is best. Many style guides advocate <- and I agree with that, but do keep spaces around <- assignments or they can become quite hard to interpret (for example, x<-3 assigns 3 to x; it does not test whether x is less than -3). If you don't use spaces (you should, except on Twitter), I prefer =. And never use ->!
But really it doesn't matter what you use as long as you are consistent in your choice. Using = on one line and <- on the next results in very ugly code.

Related

R super assignment vector

I have a function in which I use the superassignment operator to update a variable in the global environment. This works fine as long as it is a single value, e.g.
a <<- 3
However I get errors with subsets of data frames and data tables e.g.
a <- c(1,2,3)
a[3] <<- 4
Error in a[3] <<- 4 : object 'a' not found
Any idea why this is and how to solve it?
Thanks!
The superassignment operator and other scope-breaking techniques should be avoided if at all possible, in particular because they make for unclear code and confusing situations like this one. But if you really, truly had to assign values to a variable that is out of scope, you could use standard assignment inside eval:
a <- c(1,2,3)
eval(a[3] <- 4, envir = -1)
a
[1] 1 2 4
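To generalize this further (if performing the assignment inside a function), note that the unquoted a[3] <- 4 is evaluated in the calling frame before eval ever sees it, so you may need to use <<- inside eval anyway, or quote the expression and name the target environment, e.g. eval(quote(a[3] <- 4), envir = globalenv()).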
While changing variables out of scope is still a bad idea, using eval at least makes the operation more explicit, since you have to specify the environment in which the expression is to be evaluated.
All that said, scope-breaking assignments are never necessary, per se, and you should perhaps find a way to write your script such that this is not relied on.
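One likely reading of the error in the question (an assumption, since the surrounding function is not shown) is that a[3] <<- 4 first has to find an existing a in an enclosing environment, and none was there. The sketch below uses a hypothetical make_vector_store() helper to show a case where the lookup succeeds, because a lives in the environment that encloses the function doing the assignment.

```r
# Hypothetical helper: `a` lives in the environment that encloses set_third().
make_vector_store <- function() {
  a <- c(1, 2, 3)
  set_third <- function(value) {
    # `<<-` starts looking for `a` in the enclosing environment and finds it there.
    a[3] <<- value
    invisible(a)
  }
  get_vector <- function() a
  list(set_third = set_third, get_vector = get_vector)
}

store <- make_vector_store()
store$set_third(4)
result <- store$get_vector()
result   # 1 2 4
```

Keeping the state inside a closure like this is one way to avoid touching the global environment at all, which is in line with the advice above.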

Understanding the logic of R code [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
I am learning R through tutorials, but I have difficulties in "how to read" R code, which in turn makes it difficult to write R code. For example:
dir.create(file.path("testdir2","testdir3"), recursive = TRUE)
vs
dup.names <- as.character(data.combined[which(duplicated(as.character(data.combined$name))), "name"])
While I know what these lines of code do, I cannot read or interpret the logic of each line. I don't know whether to read left to right or right to left. What strategies should I use when reading/writing R code?
dup.names <- as.character(data.combined[which(duplicated(as.character(data.combined$name))), "name"])
Don't let lines of code like this ruin writing R code for you
I'm going to be honest here. The code is bad. And for many reasons.
Not a lot of people can read a line like this and intuitively know what the output is.
The point is you should not write lines of code that you don't understand. This is not Excel; you are not limited to a single line into which everything must fit. You have a whole deliciously large script, an empty canvas. Use that space to break your code into smaller bits that make a beautiful mosaic piece of art! Let's dive in~
Dissecting the code: Data Frames
Reading a line of code is like looking at a face for familiar features. You can read left to right, middle to out, whatever -- as long as you can lock onto something that is familiar.
Okay you see data.combined. You know (hope) it has rows and columns... because it's data!
You spot a $ in the code, so you can guess it is a data.frame. This is because lists and data.frames (which are really just lists) allow you to subset columns using $ followed by the column name. Subset, by the way, just means looking at a portion of the whole. In R, subsetting for data.frames and matrices can be done using single brackets, [ ], within which you will see [row, column]. Thus if we type data.combined[1,2], it would give you the value in row 1 of column 2.
Now, if you knew that the name of column 2 was name you can use data.combined[1,"name"] to get the same output as data.combined$name[1]. Look back at that code:
dup.names <- as.character(data.combined[which(duplicated(as.character(data.combined$name))), "name"])
Okay, so now our eyes should be locked on data.combined[SOMETHING IS IN HERE?!] and slowly be picking out data.combined[ ?ROW? , Oh the "name" column]. Cool.
Finding those ROW values!
which(duplicated(as.character(data.combined$name)))
Anytime you see the which function, it is just giving you locations. An example: for the numeric vector a = c(1,2,2,1), which(a == 1) would give you 1 and 4, the locations of the 1s in a.
Now duplicated is simple too. duplicated(a) (which is just duplicated(c(1,2,2,1))) will give you back FALSE FALSE TRUE TRUE. If we ran which(duplicated(a)) it would return 3 and 4. Now here is a secret you will learn: if you have TRUEs and FALSEs, you don't need the which function, because a logical vector can be used as an index directly! So which was probably unnecessary here. The same goes for as.character, since duplicated works on both numbers and strings.
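To make the point about which concrete, here is a small sketch using the same vector a as above; the intermediate names (is_dup, positions, and so on) are just for illustration.

```r
a <- c(1, 2, 2, 1)

is_dup    <- duplicated(a)   # FALSE FALSE  TRUE  TRUE
positions <- which(is_dup)   # 3 4

# Indexing by position and indexing by the logical vector select the same elements.
by_position <- a[positions]
by_logical  <- a[is_dup]
identical(by_position, by_logical)   # TRUE

# duplicated() gives the same answer on numbers and on their character form.
same_flags <- identical(duplicated(as.character(a)), is_dup)
same_flags                           # TRUE
```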
What You Should Be Writing
Who am I to tell you how to write code? But here's my take.
Don't mix up ways of subsetting: use EITHER data.frame[,column] or data.frame$column...
The code could have been written a little bit more legibly as:
dupes <- duplicated(data.combined$name)
dupe.names <- data.combined$name[dupes]
or equally:
dupes <- duplicated(data.combined[,"name"])
dupe.names <- data.combined[dupes,"name"]
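If you want to convince yourself that the shorter version does the same job, you can try it on a toy data frame. The data below is invented purely for illustration and assumes the name column is stored as character rather than as a factor.

```r
# Made-up stand-in for the asker's data.combined.
data.combined <- data.frame(
  name = c("Ann", "Bob", "Ann", "Cid", "Bob"),
  age  = c(31, 45, 31, 22, 45),
  stringsAsFactors = FALSE
)

# The original one-liner.
original <- as.character(
  data.combined[which(duplicated(as.character(data.combined$name))), "name"]
)

# The two-step rewrite.
dupes      <- duplicated(data.combined$name)
dupe.names <- data.combined$name[dupes]

identical(original, dupe.names)   # TRUE
dupe.names                        # "Ann" "Bob"
```

If name were a factor, the two-step version would return a factor, so the outer as.character would still be needed to get plain strings.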
I know this was lengthy but I hope it helps.
An easier way to read any code is to break it up into its components.
dup.names <-
  as.character(
    data.combined[which(
      duplicated(
        as.character(
          data.combined$name
        )
      )
    ), "name"]
  )
For each of the functions (the parts followed by round brackets, e.g. as.character()) you can learn more about what they do and how they work by typing, for example, ?as.character in the console.
Square brackets [] are used to subset data frames, which are stored in your environment (the box to the upper right if you're using R within RStudio contains your values as well as any defined functions). In this case, you can tell that data.combined is the name that has been given to such a data frame in this example (type ?data.frame to find out more about data frames).
"Unwrapping" long lines of code can be daunting at first. Start by breaking it down into parentheses, brackets, and commas. Parentheses directly tacked onto a word indicate a function, and any commas that lie within them (unless they are part of another nested function or bracket) separate the arguments, which modify the way the function behaves. We can reduce your second line to an outer function as.character and its argument:
dup.names <- as.character(argument_1)
Just from this, we know that dup.names will be assigned a value of type "character" that is built from a single argument.
Two functions in the first line, file.path() and dir.create(), contain a comma to denote two arguments. Arguments can either be a single value or specified with an equal sign. In this case, the output of file.path happens to perform as argument #1 of dir.create().
file.path(argument_1,argument_2)
dir.create(argument_1,argument_2)
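One way to see this nesting is to evaluate the inner call first, store its result, and pass that to the outer call. The sketch below works under a temporary directory (base_dir is a made-up scratch location) so it does not leave folders in your working directory.

```r
# Hypothetical scratch location so nothing is created in the working directory.
base_dir <- tempfile("demo_")

# Inner call first: file.path() only builds a path string; it creates nothing.
nested_path <- file.path(base_dir, "testdir2", "testdir3")

# Outer call: the string is argument 1, `recursive = TRUE` is a named argument 2.
created <- dir.create(nested_path, recursive = TRUE)
existed <- dir.exists(nested_path)

created   # TRUE when the directories were created
existed   # TRUE

# Clean up the scratch directories.
unlink(base_dir, recursive = TRUE)
```

Written this way, the one-liner from the question is just the same two steps with the intermediate variable removed.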
Brackets are a way of subsetting data frames, with the general notation of dataframe_object[row,column]. Within your second line is a dataframe object, data.combined. The brackets directly tacked onto it, with a comma separating two indices, are a strong hint that it is a dataframe object, and knowing this allows you to see that any functions inside the brackets are contributing to subsetting this data frame.
data.combined[row, column]
So from there, we can see that the functions inside this bracket will produce an output that specifies the rows of data.combined that will contribute to the subset, and that only the column named "name" will be selected.
Use the help function to start to unpack these lines by discovering what each function does, and what its arguments are.

variable-assignment: Difference between "<-" and "=" in a certain post & avoiding using "return" [duplicate]

This question already has answers here:
What are the differences between "=" and "<-" assignment operators?
(9 answers)
Closed 6 years ago.
I found the following solution in this post:
https://stackoverflow.com/a/34327262/2994949
The user eipi10 uses = instead of <- to assign a value to the corrFunc function. Why does he do this?
Also, he/she creates the data.frame in the next line, but does not use a return to have that data.frame returned from the code. The function works, so I wonder why and how.
EDIT
Does it provide any advantages to use or not to use the return command? This is something that has not been answered before, that's why I think this is not a duplicate.
I tried to ask this in a comment, but I need 50 reputation to put comments, and when I put an answer in the initial thread, it was immediately deleted. Could anybody tell me how to ask about a solution I find in a thread when I cannot comment and cannot post an answer?
Thank you.
EDIT
The first part of my question has been answered partly by the link but I still do not understand why the return is avoided. thanks :)
From ?return:
If the end of a function is reached without calling return, the value of the last evaluated expression is returned.
For example,
f <- function() {
  x <- 1
  x
}
is equivalent to the same function with return(x) as the last statement. Perhaps surprisingly,
f <- function() {
  x <- 1
}
also returns the same value, but returns it invisibly. There is a minor schism (perhaps not quite as strong as the = vs. <- schism) about whether it's better practice to always use an explicit return(): I believe it is good practice (because it makes the intention of the code more explicit), but many old-school R programmers prefer the implicit return value.
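The three variants can be compared side by side with withVisible(), which reports both the returned value and whether it would be auto-printed. The function names below are made up for this sketch.

```r
f_implicit  <- function() { x <- 1; x }          # last expression is `x`
f_explicit  <- function() { x <- 1; return(x) }  # explicit return()
f_invisible <- function() { x <- 1 }             # last expression is an assignment

res_implicit  <- withVisible(f_implicit())
res_explicit  <- withVisible(f_explicit())
res_invisible <- withVisible(f_invisible())

# All three return the value 1 ...
c(res_implicit$value, res_explicit$value, res_invisible$value)

# ... but only the first two would print it at the console.
c(res_implicit$visible, res_explicit$visible, res_invisible$visible)
```

This is why calling the third variant at the prompt appears to do nothing, while y <- f_invisible() still stores 1 in y.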

Why does coercing a column in a data.table with by not work while coercing without does work without warning? [duplicate]

This question already has an answer here:
How to change type of target column when doing := by group in a data.table in R?
(1 answer)
Closed 5 years ago.
Below I am doing the same operation in two ways. The first does not work, while the second does work. I am wondering why? I have not been able to find an answer to this question in data.table documentation or other places through google.
SOtable <- data.table(testInt=c(1:100))
SOtable[,testInt := as.double(testInt), by=1:nrow(SOtable)]
##Error in `[.data.table`(SOtable, , `:=`(testInt, as.double(testInt)), :
## Type of RHS ('double') must match LHS ('integer'). To check and coerce would impact performance too much for the fastest cases. Either change the type of the target column, or coerce the RHS of := yourself (e.g. by using 1L instead of 1)
SOtable[,testInt := as.double(testInt)]
The reason to try this is because I wanted to do some manipulation on a column in a big data.table for each row, but as soon as I use by I get the LHS/RHS error. But as I am typing this I am thinking: "Maybe I should have used some apply function for this instead?"
Answer by @Roland:
In the first example you replace the entire column, so the column's existing type does not matter.
When using by, each value gets written into the existing column as it is calculated, so if the types differ it will try to write (in this case) a double value into an integer column, which is not going to work.

Variable part in an expression( ... ) with R [duplicate]

This question already has answers here:
Use a variable within a plotmath expression
(3 answers)
Closed 9 years ago.
I would like to display physical units in an R plot. In order to have a better typography, I use the expression function this way:
plot(rnorm(10),rnorm(10),main=expression(µg.L^-1))
Suppose now that the unit is not statically known, and is given by a variable [unit]:
unit = 'µg.L^-1'
plot(rnorm(10),rnorm(10),main=expression(unit))
This of course does not work because [unit] is not substituted by its value. Is there some means to achieve that anyway?
Edit:
I should stress that the main difficulty here is that the unit to be displayed is sent as a string to my plot function. So the contents of unit should be interpreted as an expression at some point (that is, transformed from a string to an expression object), and this is where the answer by texb comes in handy. So please unmark this question as a duplicate, since the use of parse is fundamental here and is not even mentioned in the post you suggest.
How about:
unit = 'µg.L^-1'
plot(rnorm(10),rnorm(10),main=parse(text=unit))
The bquote function gives you flexibility in creating expressions while inserting values from variables. Here is one example:
unit <- as.name('mu')
plot(rnorm(10), main=bquote( .(unit)*.L^-1 ) )
I think both answers are helpful but would like to suggest a more complete use of the plotmath syntax. The answer you accepted at the moment doesn't really parse the Greek mu separately and Greg Snow's answer doesn't illustrate how expressions can be used as values (but it does show how to substitute within expressions). So this is another alternative that also shows using a plotmath cdot operator as the separating "dot" which I think better addresses your interest in typography.
plot(1,1, main=expression(mu*g %.% L^-1) )
It's also possible to create a fully formed expression and save it under a name:
micgmperL = expression(mu*g %.% L^-1)
plot(1,1, main=micgmperL)
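Putting the answers together, a unit that arrives as a string can be parsed into a plotmath expression, and the same title can also be built with bquote. This is only a sketch: the string contents and variable names are assumptions, and pdf(NULL) is used so the plots are drawn on a null device instead of opening a window.

```r
# Hypothetical unit passed in as text, already written in plotmath syntax.
unit_string <- "mu*g %.% L^-1"
from_string <- parse(text = unit_string)   # string -> expression object

# The same title built by substituting a symbol into a quoted call.
greek <- as.name("mu")
from_bquote <- bquote(.(greek) * g %.% L^-1)

# Both routes describe the same call.
same_call <- identical(deparse(from_string[[1]]), deparse(from_bquote))
same_call

pdf(NULL)   # null device: nothing is written to disk
plot(1, 1, main = from_string)
plot(1, 1, main = from_bquote)
invisible(dev.off())
```

parse(text = ...) is the piece that handles the string-input case from the question; bquote is more convenient when only part of the title varies.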
