I am quite new to R, and I know this is very simple, but I got stuck.
Could you please tell me how to write the equivalent of the Excel formula ="X" & i (for i from 1 to 10, for instance) inside a loop in R?
For example, assume I have two data frames with a single column each, "SUBSET1" and "SUBSET2". What I want is to save the sum of each column in two different data frames.
For a reproducible example, please refer to the EDIT part below.
Illustration:
for (i in 1:2)
{
assign(paste0("sum_results", i),"")
}
for (i in 1:2)
{
sum_results & i<-sum(subset & i) ----something which works in this way
}
I would be very grateful for any hint.
EDIT: Proper example:
Let's assume I have the following data frames
a<-c(2,3,4)
b<-c(2,3,5)
subset1<-data.frame(a,b)
a<-c(2,7,5)
b<-c(4,8,15)
subset2<-data.frame(a,b)
So the desired output is two data frames, sum_results1 and sum_results2, where sum_results1
is the sum of column "a" of subset1, and sum_results2 is the sum of column "a" of subset2.
for (i in 1:2)
{
assign(paste0("sum_results", i),"")
}
for (i in 1:2)
{
sum_results & i<-sum(subset & i)$a --that is where the problem is
}
You were very close. Assuming I understand your question correctly, try this:
for (i in 1:2)
{
# get() looks up the data frame by its name; $a picks the column the question asks for
assign(paste0("sum_results", i), sum(get(paste0("subset", i))$a))
}
Generally, you want to avoid loops like this in R. See the comments on your question regarding lapply. There are probably more efficient ways to solve this, but, as also mentioned in the comments, you have not provided a reproducible example. Let me know if this helps!
EDIT: below is how you would use sapply, followed by my solution above to name your results. sapply lets you use a more complicated function that could potentially work with more than one column. You will have to be specific.
N <- 2
res <- sapply(1:N, function(i) sum(get(paste0("subset", i))$a))
for (i in 1:N)
{
assign(paste0("sum_results", i),res[i])
}
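For comparison, here is a minimal sketch of the list-based approach that the lapply comments point to, using the data from the question's EDIT. The names subsets and sum_results are my own choices, and I am assuming a named vector of sums is an acceptable replacement for separate sum_results1 and sum_results2 objects.

```r
# Data from the question's EDIT
subset1 <- data.frame(a = c(2, 3, 4), b = c(2, 3, 5))
subset2 <- data.frame(a = c(2, 7, 5), b = c(4, 8, 15))

# Collect the data frames in a named list instead of looking them up with get()
subsets <- list(subset1 = subset1, subset2 = subset2)

# Sum column "a" of each data frame; vapply checks that every result is one number
sum_results <- vapply(subsets, function(d) sum(d$a), numeric(1))

sum_results[["subset1"]]  # 9
sum_results[["subset2"]]  # 14
```

Keeping the results in one named vector means there is nothing to build with paste0 later: you index by name instead.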
I need to subset a data frame in several others based in the values of several columns of the original data frame.
Here's my for loop:
for (i in 1:qtde_erros_esti){
temp_esti <- erro_esti[(paste0("erro_esti$" , "erro", i) == "1"),]
assign(paste0("erro", i,"_esti"), temp_esti)
rm(temp_esti)
}
The last piece of the puzzle for me is to pass the name of the column whose value I must check (1st line in the for loop).
I'm trying to pass it with the function paste0, but the result of the function is a string that will never be equal to "1", hence never getting any data.
How can I pass the column names (erro_esti$erro1, erro_esti$erro2, and so on...) in this case?
Observation: I'm aware that this may not be the best approach using R, but I'm a noobie, coming from SAS, so I have limited knowledge.
Secondary question: is the way that I formulated the question (topic title) good? Accepting criticism on that too, please, aiming to improve future questions.
Thanks in advance to anyone who takes some time to read this.
We can use [[ instead of $ to subset the column dynamically:
erro_esti[[paste0("erro", i)]]
Full code:
for(i in seq_len(qtde_erros_esti)) {
temp_esti <- erro_esti[erro_esti[[paste0("erro", i)]] == 1,]
assign(paste0("erro", i,"_esti"), temp_esti)
rm(temp_esti)
}
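If the separate erroN_esti objects are not strictly required, the same [[ lookup also works inside lapply and returns everything in one list. This is only a sketch: the toy erro_esti below, its id column, and the names cols and erros are made up for illustration.

```r
# Hypothetical stand-in for the asker's data: two flag columns, erro1 and erro2
erro_esti <- data.frame(
  id    = 1:4,
  erro1 = c(1, 0, 1, 0),
  erro2 = c(0, 0, 1, 1)
)
qtde_erros_esti <- 2

cols <- paste0("erro", seq_len(qtde_erros_esti))

# One filtered data frame per flag column; which() skips rows where the flag is NA
erros <- lapply(cols, function(col) erro_esti[which(erro_esti[[col]] == 1), ])
names(erros) <- paste0(cols, "_esti")

erros$erro1_esti$id  # 1 3
erros$erro2_esti$id  # 3 4
```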
You are probably making this more complicated than it needs to be. Consider this approach:
for (i in 1:qtde_erros_esti){
column.name <- paste0("erro", i)
column.data <- erro_esti[, column.name ]
## do things with the column.data vector here
}
Now you can do what needs to be done with the data from column i, using the column.data variable.
If you just want to work with every column of your data.frame, also consider this further simplified pattern:
for( column.data in erro_esti ) {
## work with column.data here
}
You can just iterate over the columns of erro_esti directly, no need to use a counter, unless you need that counter for something else.
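When you need both the column name and its data, iterating over names() is a middle ground between the two patterns. A small sketch with made-up data (the toy erro_esti and the counts vector are assumptions for illustration):

```r
# Toy data standing in for the asker's erro_esti
erro_esti <- data.frame(erro1 = c(1, 0, 1), erro2 = c(0, 0, 1))

# Loop over the column names, then fetch each column with [[
for (column.name in names(erro_esti)) {
  column.data <- erro_esti[[column.name]]
  cat(column.name, "has", sum(column.data == 1), "flagged rows\n")
}

# The same count without an explicit loop; the result is named by column
counts <- vapply(erro_esti, function(x) sum(x == 1), numeric(1))
```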
I'm sure someone has asked this (very basic) question before, but I must be searching for the wrong thing because I can't find an answer:
I frequently need to perform operations that involve combining data from multiple rows of the same dataframe. I know how to do this with a looping construct, e.g.
for (i in 2:nrow(df)) { df$result[i] <- df$data[i] - df$data[i-1] }
for (i in 12:nrow(df)) { j <- i - 11; df$result[i] <- prod(df$data[j:i]) }
Is there a general solution for these types of operations that does not involve looping? Or is looping actually the best way to do it in R?
You may try subsetting your data frame, e.g. this:
for (i in 2:nrow[df]) { df$result[i] <- df$data[i] - df$data[i-1] }
becomes:
df$result[2:nrow(df)] <- df$data[2:nrow(df)] - df$data[1:(nrow(df) - 1)]
Note: nrow() is a function AFAIK, so you should call it using parentheses, not square brackets.
In base R:
df$result[2:nrow(df)] = diff(df$data)
df$result2[13:nrow(df)] = diff(df$data,12)
Or dplyr:
df$result = df$data - dplyr::lag(df$data)
df$result2 = df$data - dplyr::lag(df$data, 12)
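The second loop in the question is a rolling product rather than a difference, so diff does not cover it. One loop-free sketch uses embed from base R, which lays out every window of a given length as a row of a matrix. The toy df, the window variable w, and the loop_result check are my own additions for illustration.

```r
# Toy data: 13 values, so there are two complete windows of length 12
df <- data.frame(data = c(2, 1, 3, 2, 1, 2, 1, 1, 2, 1, 3, 1, 2))
n <- nrow(df)
w <- 12  # window length used in the question

# Loop version from the question, kept only to check the result
loop_result <- rep(NA_real_, n)
for (i in w:n) loop_result[i] <- prod(df$data[(i - w + 1):i])

# Vectorised version: each row of embed() holds one window of w values
df$result <- c(rep(NA_real_, w - 1), apply(embed(df$data, w), 1, prod))

all.equal(df$result, loop_result)  # TRUE
```

The first w - 1 entries stay NA because there is no complete window for them, which matches what the loop leaves unset.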
I would like to loop over a string variable. For example:
clist <- c("BMI", "trig", "hdl")
for (i in clist) {
data_FK_i<-subset(data_FK, subset= !is.na(FK) & (!is.na(i)))
}
The "i" should receive a different name from the list.
What am I doing wrong? It's not working? Adding "" doesn't seem to help.
Thanks,
Einat
Thanks, the "assign" answer did the work!!!!!!!!!!
I agree with @Thomas. You should use a list. However, let me demonstrate how to modify your code to create multiple objects. You can use the function assign to create objects based on strings.
clist <- c("BMI", "trig", "hdl")
for (i in clist) {
# keep only the rows of data_FK where both FK and the current column are non-missing
assign(paste0("data_FK_", i), data_FK[complete.cases(data_FK[c("FK", i)]), ])
}
Try something like this instead, which will give you a list containing the three subsetted dataframes:
lapply(clist, function(x) data_FK[ !is.na(data_FK$FK) & !is.na(data_FK[,x]) ,])
The problem in your code is that i is a character string, specifically one of the values from clist in each iteration of the for-loop. So, when R reads !is.na(i) you're saying !is.na("BMI"), etc.
Various places on Stack Overflow advise against using subset at all in favor of extraction indices (i.e., [) like in the example code above because subset relies on non-standard evaluation that is confusing and sometimes leads you down bad rabbit holes.
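To make the list easier to work with, you can name its elements after the variables. A minimal sketch, assuming a data_FK with the columns from the question; the values below are invented purely so the example runs.

```r
# Hypothetical data with the column names used in the question
data_FK <- data.frame(
  FK   = c(1, NA, 3, 4),
  BMI  = c(22, 25, NA, 30),
  trig = c(NA, 1.2, 1.5, 1.1),
  hdl  = c(1.0, 1.1, 1.2, NA)
)
clist <- c("BMI", "trig", "hdl")

# setNames() makes lapply return a list named BMI, trig, hdl
subsets <- lapply(
  setNames(clist, clist),
  function(x) data_FK[!is.na(data_FK$FK) & !is.na(data_FK[[x]]), ]
)

names(subsets)      # "BMI" "trig" "hdl"
nrow(subsets$BMI)   # 2
```

Each subset is then available as subsets$BMI, subsets[["trig"]], and so on, without creating separate objects.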
Is this what you want?
You need to give the loop something to store the data into.
Also you need to tell the loop how long you want it to run.
clist <- c("BMI", "trig", "hdl")
#empty vector
data_FK<-c()
#I want a loop and it will 'loop' 3 times (1 to 3), which is the length of my list
for (i in 1:length(clist)) {
#each loop stores the corresponding item from the list into the vector
data_FK<-c(data_FK,clist[i])
}
## or if you want to store the values in a data frame
## there are other ways to create this, but here is a simple solution
data_FK<-data.frame(placer=1:length(clist))
for(i in 1:length(clist)){
data_FK$items[i]<-clist[i]
}
## or maybe you just want to print the names
for (i in 1:length(clist)){
print(clist[i])
}
I am new to R and it seems like this shouldn't be a difficult task but I cannot seem to find the answer I am looking for. I am trying to add multiple vectors to a data frame using a for loop. This is what I have so far and it works as far as adding the correct columns but the variable names are not right. I was able to fix them by using rename.vars but was wondering if there was a way without doing that.
for (i in 1:5) {
if (i==1) {
alldata<-data.frame(IA, rand1) }
else {
alldata<-data.frame(alldata, rand[[i]]) }
}
Instead of the variable names being rand2, rand3, rand4, rand5, they show up as rand..i.., rand..i...1, rand..i...2, and rand..i...3.
Any Suggestions?
You can set variable names using the colnames function. Therefore, your code would look something like:
newdat <- cbind(IA, rand1, rand[2:5])
colnames(newdat) <- c(colnames(IA), paste0("rand", 1:5))
If you're creating your variables in a loop, you can assign the names during the loop
alldata <- data.frame(IA)
for (i in 1:5) {alldata[, paste0('rand', i)] <- rand[[i]]}
However, adding columns one at a time in a loop is comparatively slow in R, so if you are trying to do this with tens of thousands of columns, the cbind and rename approach will be much faster.
Just do cbind(IA, rand1, rand[2:5]).
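Another way to get the names right without renaming afterwards is to name the list elements before building the data frame. This sketch assumes IA is a plain vector and rand is a list of five equal-length vectors; the question does not show either object, so both are invented here.

```r
# Hypothetical inputs (not shown in the question)
IA   <- 1:3
rand <- lapply(1:5, function(i) i * c(1, 2, 3))

# Name the list elements first; the names carry over as column names
names(rand) <- paste0("rand", seq_along(rand))
alldata <- data.frame(IA = IA, as.data.frame(rand))

names(alldata)  # "IA" "rand1" "rand2" "rand3" "rand4" "rand5"
```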
I want to make a loop which contains two variables i,j. for each i equals 1:24, j can be 1:24
but I don't know to make this loop;
i=1
while(i<=24)
{
j=seq(1,24,by=1)
for (j in j)
{
cor[i,j]
}
}
i=i+1
is this right? my output is cor[i,j].
In order to accomplish your final goal try...
cor(myMatrix)
The result is a matrix containing all of the correlations of all of the columns in myMatrix.
If you want to go about it the way you were, it's probably best to generate a matrix of all of the possible combinations of your items using combn. Try combn(1:4,2) and see what it looks like for a small example. For your example with 24 columns, the best way to cycle through all combinations using a for loop is...
myMatrix <- matrix(rnorm(240), ncol = 24)
myIndex <- combn(1:24,2)
for(i in seq_len(ncol(myIndex))){
temp <- cor(myMatrix[,myIndex[1,i]],myMatrix[,myIndex[2,i]])
print(c(myIndex[,i],temp))
}
So, while it's possible to do it with a for loop in R, you'd never do it that way.
(and this whole answer is based on a wild guess about what you're actually trying to accomplish because the question, and your comments, are very hard to figure out)
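As a quick check that the two routes agree, the sketch below computes every pairwise correlation both ways on random data and compares them. The names allCors and pairCors and the seed are my own choices for illustration.

```r
set.seed(42)                                   # arbitrary seed for repeatability
myMatrix <- matrix(rnorm(240), ncol = 24)

# One call: 24 x 24 matrix of all column correlations
allCors <- cor(myMatrix)

# Pair by pair: one column of myIndex per combination of two columns
myIndex  <- combn(1:24, 2)
pairCors <- apply(myIndex, 2, function(p) cor(myMatrix[, p[1]], myMatrix[, p[2]]))

# Indexing allCors with a two-column matrix picks out the (row, col) pairs
all.equal(pairCors, allCors[t(myIndex)])  # TRUE
```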