I would like to write an expression in R where I can modify some part of it depending on some other variable. For example, let's say that I want to create a lot of new columns in my data frame using the following basic expression:
MyDataFrame$ColumnNumber1 <- 33
If I wanted to create a lot of these, I could just type them out explicitly
MyDataFrame$ColumnNumber2 <- 52
MyDataFrame$ColumnNumber3 <- 12
...
and so on. However, this would be impractical if I have too many of them. Therefore, I'd like to have some way of replacing a part of the expression with something that's generated through a variable. Let's say, for example, that there was an operator Input() that would do this. Then I could write a loop looking like this:
for(i in 1:1000)
{
MyDataFrame$ColumnNumberInput(i) <- 32*i
}
where the Input(i) part would be replaced with whatever number the counter is at at that moment, which in turn would generate an expression.
I know that I can do this by using:
eval(parse(text=paste("MyDataFrame$","Column","Number",i," <- 32*i",sep="")))
but this becomes impossible to use if the expression is too complicated or long, or has other things like this nested inside of it.
Use [[]]:
for(i in 1:1000)
{
MyDataFrame[[i]] <- 32*i
}
[[]] is the programmatic equivalent of $: it accepts a position or a name that is held in a variable or built at run time.
You may then reference the data structure by position:
MyDataFrame[[14]] <- MyDataObjFourteen
MyDataFrame[[14]]
Note that $ expects a name rather than a position, so MyDataFrame$14 is not valid R.
You could also use strings, like so:
MyDataFrame[["SomeString"]] <- 32
MyDataFrame$SomeString
# 32
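Applied to the ColumnNumber example from the question, a minimal sketch could look like this. The data frame contents and the loop bound of 5 are made up for illustration.

```r
# Hypothetical three-row data frame, made up for this example
MyDataFrame <- data.frame(id = 1:3)

for (i in 1:5) {
  # Build the column name as a string, then use it inside [[ ]]
  col <- paste0("ColumnNumber", i)
  MyDataFrame[[col]] <- 32 * i
}

names(MyDataFrame)
MyDataFrame$ColumnNumber3   # 96, recycled across the three rows
```

Because the name is an ordinary string, it can be as complicated as needed without resorting to eval(parse()).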
EDIT 1:
To do something similar in a more general way, try:
assign("someArbitraryString", 3)
someArbitraryString
# 3
as.name("someArbitraryString")
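As a rough sketch of how these pieces fit together when the name itself is built at run time (varName and value are illustrative names, not from the original):

```r
# Build a name from pieces and bind a value to it
varName <- paste0("someArbitrary", "String")
assign(varName, 3)

# get() looks the object up again by its name
value <- get(varName)
value + 1                # 4

# Evaluating the symbol made by as.name() gives the same value
eval(as.name(varName))   # 3
```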
Related
I am coding this in R. I solved it in an alternative way, by turning the vector into a list and assigning a value to each element of the list, but is there a more direct, simple approach?
for(i in 1:5){
paste('var',i,sep='')=i
}
I want output where 1:5 is assigned like this:
var1=1
var2=2
var3=3
var4=4
var5=5
Don’t do this. Use a vector or list instead:
var = 1 : 5
Now you can use var[1] (instead of var1) etc.
Your code doesn’t work because paste creates a character vector, not a variable name.
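If the names var1 to var5 matter, one possible sketch is a named list, which keeps the names while still allowing indexed access. The name vars is chosen here only for illustration.

```r
# One named list instead of five separate variables
vars <- setNames(as.list(1:5), paste0("var", 1:5))

vars$var3        # 3
vars[["var5"]]   # 5

# The name can also be built at run time
i <- 2
vars[[paste0("var", i)]]   # 2
```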
I have a list of data frames. I want to use lapply on a specific column of each of those data frames, but I keep getting errors when I try methods from similar answers:
The setup is something like this:
a <- list(*a series of data frames that each have a column named DIM*)
dim_loc <- lapply(1:length(a), function(x){paste0("a[[", x, "]]$DIM")})
Eventually, I'll want to write something like results <- lapply(dim_loc, *some function on the DIMs*)
However, when I try get(dim_loc[[1]]), say, I get an error: Error in get(dim_loc[[1]]) : object 'a[[1]]$DIM' not found
But I can return values from function(a[[1]]$DIM) all day long. It's there.
I've tried working around this by using as.name() in the dim_loc assignment, but that doesn't seem to do the trick either.
I'm curious 1. what's up with get(), and 2. if there's a better solution. I'm constraining myself to the apply family of functions because I want to try to get out of the for-loop habit, and this name-as-list method seems to be preferred based on something like R- how to dynamically name data frames?, but I'd be interested in other, more elegant solutions, too.
I'd say that if you want to modify an object in place you are better off using a for loop, since lapply would require the <<- assignment operator (<- inside the function passed to lapply only changes a local copy). Like so:
set.seed(1)
aList <- list(cars = mtcars, iris = iris)
for(i in seq_along(aList)){
aList[[i]][["newcol"]] <- runif(nrow(aList[[i]]))
}
As opposed to...
invisible(
lapply(seq_along(aList), function(x){
aList[[x]][["newcol"]] <<- runif(nrow(aList[[x]]))
})
)
You have to use invisible(), otherwise lapply would print its output to the console. The <<- assigns the vector runif(...) to the newly created column.
If you want to produce another set of data.frames using lapply then you do:
lapply(seq_along(aList), function(x){
aList[[x]][["newcol"]] <- runif(nrow(aList[[x]]))
return(aList[[x]])
})
Also, may I suggest the use of seq_along(list) in lapply and for loops as opposed to 1:length(list) since it avoids unexpected behavior such as:
# no length list
seq_along(list()) # prints integer(0)
1:length(list()) # prints 1 0.
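On the get() part of the question: get() looks up an object by a single name, and "a[[1]]$DIM" is an expression rather than a name, so no object called that exists. A sketch that avoids building strings at all, using two made-up data frames that each have a DIM column (mean() is only a stand-in for whatever function is wanted):

```r
# Hypothetical list of data frames, each with a DIM column
a <- list(
  data.frame(DIM = c(1, 2, 3)),
  data.frame(DIM = c(10, 20))
)

# Pull the DIM column out of every data frame
dims <- lapply(a, function(df) df[["DIM"]])

# Apply a function to each DIM column
results <- sapply(dims, mean)
results   # 2 15
```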
I want to be able to use a loop to perform the same function on a group of data sets without having to recall the names of all of the data sets individually. For example, say I have the following matrices:
a<-matrix(1:5,nrow=5,ncol=2)
b<-matrix(6:10,nrow=5,ncol=2)
c<-matrix(11:15,nrow=5,ncol=2)
I define a vector of set names:
SetNames<- c("a","b","c")
Then I want to sum the second column of all of the matrices without having to call each matrix name. Basically, I would like to be able to call SetNames[1] and have the program return 'a' as usable text which can be used to call apply(a[2],2,sum).
If apply(SetNames[1][2],2,sum) worked, that would be the basic syntax I was looking for, however I would replace the 1 with a variable I can increase in a loop.
sapply can do that.
sapply(SetNames, function(z) {
dfz <- get(z)
sum(dfz[,2])
})
# a b c
# 15 40 65
Notice that get() is used here to dynamically access a variable.
A less compact way of writing this would be
sumRowTwo <- function(z) {
dfz <- get(z)
sum(dfz[,2])
}
sapply(SetNames, sumRowTwo)
and now you can play around with sumRowTwo and see what e.g.
sumRowTwo("a")
returns
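A possible variation, sketched here as an alternative rather than as the approach above, is to collect the matrices into one named list with mget() and then work on the list. The names sets and colTwoSums are illustrative.

```r
a <- matrix(1:5,   nrow = 5, ncol = 2)
b <- matrix(6:10,  nrow = 5, ncol = 2)
c <- matrix(11:15, nrow = 5, ncol = 2)

# mget() fetches several objects by name and returns a named list
sets <- mget(c("a", "b", "c"))

# Sum the second column of each matrix
colTwoSums <- sapply(sets, function(m) sum(m[, 2]))
colTwoSums
#  a  b  c
# 15 40 65
```

Once the data lives in a list, the individual variable names are no longer needed for further processing.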
I am implementing k-means in R.
In a loop, I am initiating several vectors that will be used to store values that belong to a particular cluster, as seen here:
for(i in 1:k){
assign(paste("cluster",i,sep=""),vector())
}
I then want to add to a particular "cluster" vector, depending on the value I get for the variable getIndex. So if getIndex is equal to 2 I want to add the variable minimumDistance to the vector called cluster2. This is what I am attempting to do:
minimumDistance <- min(distanceList)
getIndex <- match(minimumDistance,distanceList)
clusterName <- paste("cluster",getIndex,sep="")
name <- c(name, minimumDistance)
But obviously the above code does not work because in order to append to a vector that I'm naming I need to use assign as I do when I instantiate the vectors. But I do not know how to use assign, when using paste, when also appending to a vector.
I cannot use the index such as vector[i] because I don't know what index of that particular vector I want to add to.
I need to use the vector <- c(vector,newItem) format but I do not know how to do this in R. Or if there is any other option I would greatly, greatly appreciate it. If I were using Python I would simply use paste and then use append but I can't do that in R. Thank you in advance for your help!
You can do something like this:
out <- list()
for (i in 1:nclust) {
# assign some data (in this case a list) to a cluster
assign(paste0("N_", i), list(...))
# here I put all the clusters data in a list
# but you could use a similar statement to do further data manipulation
# ie if you've used a common syntax (here "N_" <index>) to refer to your elements
# you can use get to retrieve them using the same syntax
out[[i]] <- get(paste0("N_", i))
}
If you want a more complete code example, emclustr::em_clust_mvn looks like a similar problem.
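For the append step asked about in the question, a sketch under made-up inputs (k and the distanceList values are assumptions) could keep the clusters in one list and index it with getIndex. The assign()/get() form is shown afterwards for comparison.

```r
k <- 3
# One slot per cluster, initially NULL
clusters <- vector("list", k)
names(clusters) <- paste0("cluster", seq_len(k))

# Hypothetical distances from one point to each of the k centroids
distanceList <- c(4.2, 1.5, 3.7)

minimumDistance <- min(distanceList)
getIndex <- which.min(distanceList)

# Append to the chosen cluster by indexing the list
clusters[[getIndex]] <- c(clusters[[getIndex]], minimumDistance)
clusters$cluster2   # 1.5

# The same append with assign()/get(), if separate variables are kept
cluster2 <- vector()
clusterName <- paste0("cluster", getIndex)
assign(clusterName, c(get(clusterName), minimumDistance))
cluster2            # 1.5
```

The list version avoids looking anything up by name, which tends to be easier to debug.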
I would like to build a list using a loop.
I have a set of variables called:
var1, var2, ... varN
And I would like to easily create a list of length N named listvar with:
unlist(listvar[i])=vari (with i in 1:N)
Does anyone have an idea?
The code makes me wonder why the variables var1 … varN exist in the first place: they shouldn’t. Instead, generate the list directly.
That said, you can easily retrieve the value of a variable given by its name using get, or mget for several names at once. This doesn’t even require a loop, you can use R’s vectorised operations.
varnames = paste0('var', 1 : N)
listvar = mget(varnames)
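A small runnable sketch of the same idea, with three made-up variables standing in for var1 … varN:

```r
# Hypothetical variables standing in for var1 ... varN
var1 <- 10
var2 <- "b"
var3 <- TRUE
N <- 3

varnames <- paste0("var", 1:N)
listvar <- mget(varnames)

listvar[[2]]     # "b"
names(listvar)   # "var1" "var2" "var3"
```

The result is a named list, so elements can be reached by position or by name.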