Is there a way to change variable assignment names - r

Using R. Is there a way that I can give R any text string and it will treat it like a formula?
An example says it all.
a <- 1
b <- 2
c <- 3
d <- 4
What if I had to do this all the way up to z?
In R we can write:
letters[1]
This gives us an "a"
So what about something like this:
(It doesn't work but I'd like to do something like this)
for (i in 1:4) {
letters[i] <- i
}
There's the as.formula function but that's only good for formulas like a ~ b + c.
Thanks.

If you want to evaluate a text string as code:
eval(parse(text="a<-1"))
But if you want to initialize many variables, you can create a named list and convert it into separate variables (assigning each component into the global environment) using list2env. That said, I would highly recommend that you keep your variables together in the list. Note that list2env needs a named list, so the vector is converted with as.list first.
xx <- letters[1:5]
list2env(setNames(as.list(seq_along(xx)), xx), envir = .GlobalEnv)
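A minimal sketch of the list-based alternative recommended above, assuming the goal is simply to hold the values 1 to 5 under the names a to e. The name `vals` is made up for illustration.

```r
# Build one named list instead of five separate variables.
xx <- letters[1:5]
vals <- setNames(as.list(seq_along(xx)), xx)

# Look values up by name, either literally or through a string.
vals$a
vals[["c"]]

# Loop over every name without creating any global variables.
for (nm in names(vals)) {
  cat(nm, "=", vals[[nm]], "\n")
}
```

Keeping the values in one list means the names are ordinary data, so they can be generated, looped over, and passed to functions like any other character vector.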

Related

access indexed data in a loop using R

Let's say I have 5 databases, named data1-data5. I basically want to create a loop that prints the first 10 rows of each one. In my naïve mind, the code should look something like this:
for (i in 1:5){
print(head(data[i]))
}
That does not work. What's the proper way to do this? How do I define [i] as the "indexing" variable for the different databases?
Another way would be to use the get function, which looks up an object by its name given as a string:
for (i in 1:5){
tmp <- get(paste0("data", i))
## Assigns the data to the variable tmp - just like tmp <- data1/data2/data3 etc
print(head(tmp))
}
It would be better to put these objects in a list and use [[ to reference them. But if you must use separate names for the objects, then you need to parse them and evaluate the resulting expressions.
Here's an example you can emulate. For brevity, it prints the values of numerical objects rather than the heads of "databases."
data1 <- 1; data2 <- 2; data3 <- 3
for (i in 1:3) {
print(eval(parse(text=paste0("data", i))))
}
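A sketch of the list approach mentioned above, assuming three small made-up data frames stand in for the real "databases". `mget` is base R and collects existing objects into a named list; `all_data` is a name chosen for illustration.

```r
# Three small example data frames standing in for data1..data3.
data1 <- data.frame(x = 1:3)
data2 <- data.frame(x = 4:6)
data3 <- data.frame(x = 7:9)

# Collect the existing objects into one named list.
all_data <- mget(paste0("data", 1:3), envir = environment())

# Index the list with [[ ]] inside the loop; head(., 10) shows up to 10 rows.
for (i in seq_along(all_data)) {
  print(head(all_data[[i]], 10))
}
```

Once the objects live in a list, no further get or eval(parse(...)) calls are needed.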

How to reference variables from a list when looping over variables using "for"

I am a beginner at R coming from Stata, and my first headache is figuring out how to loop over a list of names, performing the same operation on each one. The names are variables from a data frame. I tried defining a list this way: mylist<- c("df$name1", "df$name2") and then I tried: for (i in mylist) { i } which I hoped would be equivalent to writing df$name1 and then df$name2, so that R would print the content of the variables name1 and name2 from the data frame df. I tried other commands, like deleting a variable with i=NULL inside the for loop, but that didn't work either. I would greatly appreciate it if someone could tell me what I am doing wrong. I wonder if it has something to do with the way I write the i; maybe R does not interpret it to mean the elements of my character vector.
For more clarification, I will write out the code I would use in Stata in this instance. Instead of asking Stata to print the content of a variable, I am asking it to give summary statistics of a variable, i.e. the number of observations, mean, standard deviation, min and max, using the summarize command. In Stata I don't need to refer to the data frame, as I usually have only one dataset in memory, and I need only write:
foreach i in name1 name2 { #name1 and name2 being the names of the variables
summarize `i'
}
So far, I haven't managed to do the same thing using the for function in R, which I naively thought would be:
mylist<-c("df$name1", "df$name2")
for (i in mylist) {
summary(i)
}
You probably just need to print the name to see it, because values are not auto-printed inside a for loop. For example, if we have a data frame like this:
df <- data.frame("A" = "a", "B" = "b", "C" = "c")
df
# > A B C
# > 1 a b c
names(df)
# "A" "B" "C"
We can operate on the names using a for loop on the names(df) vector (no need to define a special list).
for (name in names(df)){
print(name)
# your code here
}
R is a little more reluctant than Stata to let you use strings/locals as code. You can do it with functions like eval, but in general that's not the ideal way to do it.
In the case of variable names, though, you're in luck, as you can use a string to pull out a variable from a data.frame with [[]]. For example:
df <- data.frame(a = 1:10,
b = 11:20,
c = 21:30)
for (i in c('a','b')) {
print(i)
print(summary(df[[i]]))
}
Notes:
If you want an object printed from inside a for loop, you need to use print().
I'm assuming that you're using the summary() function just as an example and so need the loop. But if you really just want a summary of each variable, summary(df) will do them all, or summary(df[,c('a','b')]) to just do a and b. Or check out the stargazer() function in the stargazer package, which has defaults that will feel pretty comfortable for a Stata user.
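When the loop body really does apply one function to each column, a loop-free version is also possible. This is only a sketch, reusing the example data frame from above; `vars`, `means` and `summaries` are illustrative names.

```r
df <- data.frame(a = 1:10,
                 b = 11:20,
                 c = 21:30)

# Column names are plain strings, so they can be stored in a vector.
vars <- c("a", "b")

# df[vars] selects those columns; sapply/lapply apply a function to each.
means <- sapply(df[vars], mean)         # named numeric vector
summaries <- lapply(df[vars], summary)  # named list, one summary per column

print(means)
print(summaries[["a"]])
```

The results keep the column names, so `means[["b"]]` or `summaries[["a"]]` can be looked up by string just like `df[[i]]`.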

Using functions to change variable names from upper to lower

I'm working with a bunch of SAS datasets and I prefer the variable names to all be lowercase, using read.sas7bdat, but I do it so often I wanted to write a function. This method works fine,
df <- data.frame(ALLIGATOR=1:4, BLUEBIRD=rnorm(4))
names(df) <- tolower(names(df))
but when I try to put it into a function it doesn't assign.
lower <- function (df) {names(df) <- tolower(names(df))}
lower(df)
I know that there is some larger concept that I'm missing, that is blocking me. It doesn't seem to do anything.
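A function in R works on its own copy of each argument, so renaming the columns inside the function does not change the caller's object. You have to return the modified data frame and assign the result:
lower <- function (df) {
names(df) <- tolower(names(df))
df
}
df <- lower(df)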
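The copy behaviour can be checked directly. A small sketch, assuming a made-up data frame `sas` and a helper named `lower_names` (both hypothetical names, not from the question):

```r
# Returns a copy of d with lower-case column names; d itself is untouched.
lower_names <- function(d) {
  names(d) <- tolower(names(d))
  d
}

sas <- data.frame(ALLIGATOR = 1:4, BLUEBIRD = c(0.5, 1.5, 2.5, 3.5))

# Calling the function without assigning the result changes nothing.
invisible(lower_names(sas))
before <- names(sas)   # still upper case

# Assigning the returned value is what updates the object.
sas <- lower_names(sas)
after <- names(sas)    # now lower case

print(before)
print(after)
```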
Although I don't see why you would do this rather than simply writing names(df) <- tolower(names(df)), I think you should do:
lower <- function (x) {tolower(names(x))}
names(df) <- lower(df)
Here is an answer that I don't recommend using anywhere other than the global environment, but it does provide some convenient shorthand. Basically, we take care of the assignment inside the function, overwriting the object passed to it. It is shorthand for you, but please be careful about how you use it:
tl <- function(x){
  # Name of the object that was passed in, as a string (e.g. "df")
  obj_name <- deparse(substitute(x))
  # Overwrite that object in the calling environment with a renamed copy
  assign(obj_name, setNames(x, tolower(names(x))), envir = parent.frame())
}
# This will 'overwrite' the names of df...
tl(df)
# Check what df now looks like...
df
alligator bluebird
1 1 0.2850386
2 2 -0.9570909
3 3 -1.3048907
4 4 -0.9077282

Assigning output of a function to two variables in R [duplicate]

This seems like an easy question, but I can't figure it out and I haven't had luck in the R manuals I've looked at. I want to find dim(x), but I want to assign dim(x)[1] to a and dim(x)[2] to b in a single line.
I've tried [a b] <- dim(x) and c(a, b) <- dim(x), but neither has worked. Is there a one-line way to do this? It seems like a very basic thing that should be easy to handle.
This may not be as simple a solution as you wanted, but it gets the job done. It's also a handy tool should you need to assign multiple variables at once without knowing in advance how many values you have.
Output <- SomeFunction(x)
VariablesList <- letters[seq_along(Output)]
for (i in seq_along(Output)) {
assign(VariablesList[i], Output[i])
}
Loops aren't the most efficient things in R, but I've used this multiple times. I personally find it especially useful when gathering information from a folder with an unknown number of entries.
EDIT: And in this case, Output could be any length (as long as VariablesList is at least as long).
EDIT #2: Changed up the VariablesList vector to allow for more values, as Liz suggested.
You can also write your own function that will always make a global a and b. But this isn't advisable:
mydim <- function(x) {
out <- dim(x)
a <<- out[1]
b <<- out[2]
}
The "R" way to do this is to output the results as a list or vector just like the built in function does and access them as needed:
out <- dim(x)
out[1]
out[2]
R is built around working with whole lists and vectors, which is why base R has no multiple-assignment syntax. Instead, it has a rich set of functions for reaching into complex data structures without looping constructs.
Doesn't look like there is a way to do this. Really the only way to deal with it is to add a couple of extra lines:
temp <- dim(x)
a <- temp[1]
b <- temp[2]
It depends what is in a and b. If they are just numbers try to return a vector like this:
# Named f rather than dim so that base R's dim() is not masked
f <- function(x, y) {
  return(c(x, y))
}
f(1,2)[1]
# [1] 1
f(1,2)[2]
# [1] 2
If a and b are something else, you might want to return a list
f <- function(x, y) {
  return(list(item1=x:y, item2=(2*x):(2*y)))
}
f(1,2)[[1]]
# [1] 1 2
f(1,2)[[2]]
# [1] 2 3 4
EDIT:
try this: x <- c(1,2); names(x) <- c("a","b")
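The named-vector idea can be applied to dim() directly. A sketch, assuming a made-up 3 x 5 matrix `m`; the names "a" and "b" follow the question, and `d` and `cells` are illustrative.

```r
m <- matrix(0, nrow = 3, ncol = 5)

# One line: attach the names a and b to the two dimensions.
d <- setNames(dim(m), c("a", "b"))

d[["a"]]   # number of rows
d[["b"]]   # number of columns

# The names can be used inside an expression without creating globals.
cells <- with(as.list(d), a * b)
print(cells)
```

For matrices and data frames, base R's nrow() and ncol() return the same two values individually.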

R equivalent to the MATLAB structure?

Is there an R type equivalent to the Matlab structure type?
I have a few named vectors and I try to store them in a data frame. Ideally, I would simply access one element of an object and it would return the named vectors (like a structure in Matlab). I feel that using a data frame is not the right thing to do since it can store the values of the named vectors but not the names when they differ from one vector to the other.
More generally, is it possible to store a bunch of different objects in a single one in R?
Edit: As Joran said I think that list does the job.
l = list()
l$vec1 = namedVector1
l$vec2 = namedVector2
...
If I have a list of names
name1 = 'vec1'
name2 = 'vec2'
is there any way for the interpreter to understand that when I use a variable name like name1, I am not referring to the variable name but to its content? I have tried get(name1) but it does not work.
I could still be wrong about what you're trying to do, but I think this is the best you're going to get in terms of accessing each list element by name:
l <- list(a = 1:3, b = 1:10)
ind <- "a"
l[[ind]]
# [1] 1 2 3
Namely, you're going to have to use [[ explicitly.
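A sketch of a struct-like list holding named vectors with different names, assuming made-up contents; `s` is an illustrative name, while `name1`, `vec1` and `vec2` follow the question. get(name1) fails here because get looks for a variable called vec1 in an environment, whereas vec1 is an element of the list.

```r
# A list can hold named vectors whose names differ from one another.
s <- list()
s$vec1 <- c(x = 1, y = 2)
s$vec2 <- c(p = 10, q = 20, r = 30)

# The element name is stored in a variable...
name1 <- "vec1"

# ...and [[ ]] uses the content of that variable, not the literal "name1".
s[[name1]]           # the named vector stored under "vec1"
names(s[[name1]])    # its names are preserved

# Indexing can be chained to reach a single named value.
s[["vec2"]][["q"]]
```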

Resources