I would like to loop over a string variable. For example:
clist <- c("BMI", "trig", "hdl")
for (i in clist) {
data_FK_i<-subset(data_FK, subset= !is.na(FK) & (!is.na(i)))
}
The "i" should receive a different name from the list.
What am I doing wrong? It's not working, and adding quotes ("") doesn't seem to help.
Thanks,
Einat
Thanks, the "assign" answer did the work!!!!!!!!!!
I agree with @Thomas. You should use a list. However, let me demonstrate how to modify your code to create multiple objects. You can use the function assign to create objects whose names are built from strings.
clist <- c("BMI", "trig", "hdl")
for (i in clist) {
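# keep only the rows of data_FK where both FK and the current variable are non-missing
assign(paste0("data_FK_", i), data_FK[complete.cases(data_FK[c("FK", i)]), ])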
}
Try something like this instead, which will give you a list containing the three subsetted dataframes:
lapply(clist, function(x) data_FK[ !is.na(data_FK$FK) & !is.na(data_FK[,x]) ,])
The problem in your code is that i is a character string, specifically one of the values from clist in each iteration of the for-loop. So, when R reads !is.na(i) you're saying !is.na("BMI"), etc.
Various places on Stack Overflow advise against using subset at all in favor of extraction indices (i.e., [) like in the example code above because subset relies on non-standard evaluation that is confusing and sometimes leads you down bad rabbit holes.
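To make the point above concrete, here is a minimal sketch of the list approach on a small made-up data frame. The values in data_FK and the name subsets are assumptions for illustration only; the column names follow the question.

```r
# Toy data, invented for illustration only
data_FK <- data.frame(
  FK   = c(1, NA, 3, 4),
  BMI  = c(22, 25, NA, 30),
  trig = c(NA, 1.1, 1.4, 1.7),
  hdl  = c(50, 55, 60, NA)
)
clist <- c("BMI", "trig", "hdl")

# A string is never NA, which is why !is.na(i) filtered nothing
is.na("BMI")  # FALSE

# One subset per variable: rows where both FK and that variable are present
subsets <- lapply(clist, function(x) {
  data_FK[!is.na(data_FK$FK) & !is.na(data_FK[[x]]), ]
})
names(subsets) <- paste0("data_FK_", clist)

nrow(subsets$data_FK_BMI)  # 2
```

Naming the list elements means each subset can still be reached by name (subsets$data_FK_BMI) without creating separate objects in the workspace.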
Is this what you want?
You need to give the loop something to store the data into.
Also you need to tell the loop how long you want it to run.
clist <- c("BMI", "trig", "hdl")
#empty vector
data_FK<-c()
#I want a loop and it will 'loop' 3 times (1 to 3), which is the length of my list
for (i in 1:length(clist)) {
#each loop stores the corresponding item from the list into the vector
data_FK<-c(data_FK,clist[i])
}
## or if you want to store the values in a data frame
## there are other ways to create this, but here is a simple solution
data_FK<-data.frame(placer=1:length(clist))
for(i in 1:length(clist)){
data_FK$items[i]<-clist[i]
}
## or maybe you just want to print the names
for (i in 1:length(clist)){
print(clist[i])
}
Let's say I have 5 databases, named data1-data5. I basically want to create a loop that prints the first 10 rows of the data. In my naïve mind, the code should look something like this:
for (i in 1:5){
print(head(data[i]))
}
That does not work. What's the proper way to do this? How do I define [i] as the "indexing" variable for the different databases?
Another way would be to use the get function:
for (i in 1:5){
tmp <- get(paste0("data", i))
## Assigns the data to the variable tmp - just like tmp <- data1/data2/data3 etc
## head() shows 6 rows by default, so ask for 10 explicitly
print(head(tmp, 10))
}
It would be better to put these objects in a list and use [[ to reference them. But if you must use separate names for the objects, then you need to parse them and evaluate the resulting expressions.
Here's an example you can emulate. For brevity, it prints the values of numerical objects rather than the heads of "databases."
data1 <- 1; data2 <- 2; data3 <- 3
for (i in 1:3) {
print(eval(parse(text=paste0("data", i))))
}
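The list-based alternative mentioned above might look like the following sketch. The three small data frames are hypothetical stand-ins for the real "databases"; mget is a base R function that collects several objects by name into a named list.

```r
# Hypothetical stand-ins for data1, data2, data3
data1 <- data.frame(x = 1:20)
data2 <- data.frame(x = 21:40)
data3 <- data.frame(x = 41:60)

# Collect the separately named objects into one named list
datasets <- mget(paste0("data", 1:3))

# Loop over the list directly, printing the first 10 rows of each
for (d in datasets) {
  print(head(d, 10))
}

# Or keep the first 10 rows of each element for later use
first_rows <- lapply(datasets, head, 10)
```

Once the objects live in a list, datasets[[i]] or datasets[["data2"]] replaces any need for get or eval(parse(...)).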
I need to split a data frame into several others based on the values of several columns of the original data frame.
Here's my for loop:
for (i in 1:qtde_erros_esti){
temp_esti <- erro_esti[(paste0("erro_esti$" , "erro", i) == "1"),]
assign(paste0("erro", i,"_esti"), temp_esti)
rm(temp_esti)
}
The last piece of the puzzle for me is to pass the name of the column whose value I must check (the first line in the for loop).
I'm trying to pass it with the function paste0, but the result of the function is a string that will never be equal to "1", so I never get any data.
How can I pass the column names (erro_esti$erro1, erro_esti$erro2, and so on...) in this case?
Observation: I'm aware that this may not be the best approach in R, but I'm a newbie coming from SAS, so I have limited knowledge.
Secondary question: is the way that I formulated the question (topic title) good? I accept criticism on that too, please, so that I can improve future questions.
Thanks in advance to anyone who takes the time to read this.
We can use [[ instead of $ to subset the column dynamically
erro_esti[[paste0("erro", i)]]
Full code:
for(i in seq_len(qtde_erros_esti)) {
temp_esti <- erro_esti[erro_esti[[paste0("erro", i)]] == 1,]
assign(paste0("erro", i,"_esti"), temp_esti)
rm(temp_esti)
}
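As a variation on the loop above, the subsets can be kept in a named list instead of being created with assign. This is only a sketch: the contents of erro_esti and the name subsets are invented for illustration, and using which() to skip NA comparisons is a design choice, not something the question asked for.

```r
# Invented example data with two indicator columns
erro_esti <- data.frame(
  id    = 1:4,
  erro1 = c(1, 0, 1, 0),
  erro2 = c(0, 0, 1, 1)
)
qtde_erros_esti <- 2

# The original comparison tested a literal string, so it was always FALSE
paste0("erro_esti$", "erro", 1) == "1"  # FALSE

cols <- paste0("erro", seq_len(qtde_erros_esti))

# [[ looks the column up by its name; which() skips rows where the comparison is NA
subsets <- lapply(cols, function(col) {
  erro_esti[which(erro_esti[[col]] == 1), ]
})
names(subsets) <- paste0(cols, "_esti")

subsets$erro1_esti$id  # 1 3
```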
You are probably making this more complicated than it needs to be. Consider this approach:
for (i in 1:qtde_erros_esti){
column.name <- paste0("erro", i)
column.data <- erro_esti[, column.name ]
## do things with the column.data vector here
}
Now you can do what needs to be done with the data from column i, using the column.data variable.
If you just want to work with every column of your data.frame, also consider this further simplified pattern:
for( column.data in erro_esti ) {
## work with column.data here
}
You can just iterate over the columns of erro_esti directly, no need to use a counter, unless you need that counter for something else.
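If the column name is needed as well as its values, looping over names() gives both without a numeric counter. A small sketch, with invented data and a hypothetical totals vector to collect one result per column:

```r
# Invented data for illustration
erro_esti <- data.frame(erro1 = c(1, 0, 1), erro2 = c(0, 0, 1))

totals <- numeric(0)
for (column.name in names(erro_esti)) {
  column.data <- erro_esti[[column.name]]   # the column as a plain vector
  totals[column.name] <- sum(column.data)   # store the result under the column's name
}

totals
```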
I have written a loop in R. The code is expected to go through a list of variables defined in a list and then for each of the variables perform a function.
Problem 1 - I cannot loop through the list of variables
Problem 2 - I need to insert each output from the values into Mongo DB
Here is an example of the list:
121715771201463_626656620831011
121715771201463_1149346125105084
Based on this value, I run some code and I want the output to be inserted into MongoDB. Right now only the first value and its corresponding output are inserted.
test_list <-
C("121715771201463_626656620831011","121715771201463_1149346125105084","121715771201463_1149346125105999")
for (i in test_list)
{ //myfunction//
mongo.insert(mongo, DBNS, i)
}
I am able to only pick the values for the first value and not all from the list
Any help is appreciated.
Try this example, which prints the final characters
myfunction <- function(x){ print( substr(x, 27, nchar(x)) ) }
test_list <- c("121715771201463_626656620831011",
"121715771201463_1149346125105084",
"121715771201463_1149346125105999")
for (i in test_list){ myfunction(i) }
for (j in 1:length(test_list)){ myfunction(test_list[j]) }
The final two lines should each produce
[1] "31011"
[1] "105084"
[1] "105999"
It is not clear whether "variable" is the same as "value" here.
If what you mean by variable is actually an element in the list you construct, then I think Ilyas comment above may solve the issue.
If "variable" is instead an object in the workspace, and elements in the list are the names of the objects you want to process, then you need to make sure that you use get. Like this:
for(i in ls()){
cat(paste(mode(get(i)),"\n") )
}
ls() returns a character vector containing the names of the objects in the workspace. The loop above goes through them all and calls get on each name to retrieve the object itself. From there, you can do the processing you want to do (in the example above, I just printed the mode of the object).
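The same idea works with an explicit vector of names rather than everything ls() returns. In this sketch the objects a, b and f are made up purely for illustration:

```r
# Made-up objects standing in for whatever is in the workspace
a <- 1:3
b <- "text"
f <- function(x) x

obj_names <- c("a", "b", "f")

# get() turns each name into the object it refers to
modes <- vapply(obj_names, function(nm) mode(get(nm)), character(1))
modes
```

Here modes is a named character vector, so modes[["b"]] gives the mode of the object called b.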
Hope this helps somehow.
I am implementing k-means in R.
In a loop, I am initiating several vectors that will be used to store values that belong to a particular cluster, as seen here:
for(i in 1:k){
assign(paste("cluster",i,sep=""),vector())
}
I then want to add to a particular "cluster" vector, depending on the value I get for the variable getIndex. So if getIndex is equal to 2 I want to add the variable minimumDistance to the vector called cluster2. This is what I am attempting to do:
minimumDistance <- min(distanceList)
getIndex <- match(minimumDistance,distanceList)
clusterName <- paste("cluster",getIndex,sep="")
name <- c(name, minimumDistance)
But obviously the above code does not work because in order to append to a vector that I'm naming I need to use assign as I do when I instantiate the vectors. But I do not know how to use assign, when using paste, when also appending to a vector.
I cannot use the index such as vector[i] because I don't know what index of that particular vector I want to add to.
I need to use the vector <- c(vector,newItem) format but I do not know how to do this in R. Or if there is any other option I would greatly, greatly appreciate it. If I were using Python I would simply use paste and then use append but I can't do that in R. Thank you in advance for your help!
You can do something like this:
out <- list()
for (i in 1:nclust) {
# assign some data (in this case a list) to a cluster
assign(paste0("N_", i), list(...))
# here I put all the clusters data in a list
# but you could use a similar statement to do further data manipulation
# ie if you've used a common syntax (here "N_" <index>) to refer to your elements
# you can use get to retrieve them using the same syntax
out[[i]] <- get(paste0("N_", i))
}
For a more complete code example, emclustr::em_clust_mvn appears to deal with a similar problem.
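For the specific problem in the question (appending minimumDistance to the vector for cluster getIndex), here is a rough sketch of two options. The values of k and distanceList are invented. The first option stores the clusters in a list, which avoids assign altogether; the second keeps the separately named vectors and combines get with assign.

```r
k <- 3
distanceList <- c(4.2, 1.5, 3.3)   # invented distances

minimumDistance <- min(distanceList)
getIndex <- which.min(distanceList)  # position of the smallest distance

# Option 1: one list with a slot per cluster
clusters <- vector("list", k)
clusters[[getIndex]] <- c(clusters[[getIndex]], minimumDistance)

# Option 2: separately named vectors, read with get() and written back with assign()
for (i in seq_len(k)) {
  assign(paste0("cluster", i), vector())
}
clusterName <- paste0("cluster", getIndex)
assign(clusterName, c(get(clusterName), minimumDistance))

cluster2  # 1.5
```

The list version is usually easier to work with afterwards, since lengths(clusters) or lapply(clusters, mean) can process every cluster at once.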
I am new to R and it seems like this shouldn't be a difficult task but I cannot seem to find the answer I am looking for. I am trying to add multiple vectors to a data frame using a for loop. This is what I have so far and it works as far as adding the correct columns but the variable names are not right. I was able to fix them by using rename.vars but was wondering if there was a way without doing that.
for (i in 1:5) {
if (i==1) {
alldata<-data.frame(IA, rand1) }
else {
alldata<-data.frame(alldata, rand[[i]]) }
}
Instead of the variable names being rand2, rand3, rand4, rand5, they show up as rand..i.., rand..i...1, rand..i...2, and rand..i...3.
Any Suggestions?
You can set variable names using the colnames function. Therefore, your code would look something like:
newdat <- cbind(IA, rand1, rand[2:5])
colnames(newdat) <- c(colnames(IA), paste0("rand", 1:5))
If you're creating your variables in a loop, you can assign the names during the loop
alldata <- data.frame(IA)
for (i in 1:5) {alldata[, paste0('rand', i)] <- rand[[i]]}
However, adding columns one at a time in a loop modifies the data frame on every iteration, which can be slow in R. If you are trying to do this with tens of thousands of columns, the cbind and rename approach will likely be much faster.
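A small sketch of the name-first, bind-once pattern. It assumes, unlike the question, that all of the vectors (including the first) are stored in one list called rand, and that IA is a data frame; the values are invented.

```r
# Invented inputs: IA as a one-column data frame, rand as a list of equal-length vectors
IA   <- data.frame(id = 1:3)
rand <- list(c(1, 2, 3), c(4, 5, 6), c(7, 8, 9))

# Name the list elements first; the names become the column names
names(rand) <- paste0("rand", seq_along(rand))

# Bind everything in one step instead of growing the data frame in a loop
alldata <- cbind(IA, as.data.frame(rand))
names(alldata)  # "id" "rand1" "rand2" "rand3"
```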
Just do cbind(IA, rand1, rand[2:5]).