I have a for loop like this:
for(i in c("a","b","c","d"))
{
as.name(paste("df",i,sep=""))= mydataframe
}
mydataframe is a data frame, and I want to create the data frames dfa, dfb, dfc and dfd using this loop.
The as.name(paste("df",i,sep="")) does not work here. I do not want to create a list that has the 4 data frames.
Can I directly create 4 data frames from this loop?
You can do this using assign, although in general you are better off using lists.
Using your example:
for (i in letters[1:4]) {
  assign(paste0("df", i), mydataframe)
}
Note that this will simply create the same object 4 times, unless you change what mydataframe is inside the loop.
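For comparison, here is a minimal sketch of the list-based alternative mentioned above. The contents of mydataframe and the name dfs are made up for illustration; the idea is only that one named list can stand in for four separate variables.

```r
# Example data; the real mydataframe would come from your own code
mydataframe <- data.frame(x = 1:3, y = c("p", "q", "r"), stringsAsFactors = FALSE)

suffixes <- c("a", "b", "c", "d")

# Build one list element per suffix (here every element is the same data frame)
dfs <- lapply(suffixes, function(s) mydataframe)
names(dfs) <- paste0("df", suffixes)

# Elements are then reachable by name, e.g. dfs$dfa or dfs[["dfb"]]
print(names(dfs))
```

Keeping the data frames in a list means later steps can loop over dfs directly instead of reconstructing variable names with paste0 and get.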
Related
I am trying to run a for loop over a data frame inside a list. Inside the loop I run several operations and build a new data frame, which I add to the list. I then want the loop to pick up this new data frame and re-run the operations on it. I want to repeat this several times, but I do not know how. My current code looks like this:
list_r <- list(Original=df)
i <- 1
for (i in seq_along(list_r)){
df2 <- data.frame(x=c(5,5))
list_r[[i+1]] <- df2
i <- i+1
}
This is a simplification of my code, but could someone explain why my loop does not run again when I add a data frame to the list? (That is, re-run the code, but now for df2.)
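One likely explanation: R evaluates seq_along(list_r) once, before the first iteration, so the loop only ever sees the original length of 1, and assigning to i inside the body does not affect the next iteration. A minimal sketch of one way around this, assuming the number of rounds is known in advance (n_rounds and the "+ 5" operation are made up for illustration):

```r
df <- data.frame(x = c(1, 2))
list_r <- list(Original = df)

n_rounds <- 3  # hypothetical: how many times to repeat the operations

for (i in seq_len(n_rounds)) {
  current <- list_r[[i]]                 # the most recently added data frame
  df2 <- data.frame(x = current$x + 5)   # placeholder for the real operations
  list_r[[i + 1]] <- df2                 # append the result for the next round
}

print(length(list_r))
```

If the number of rounds depends on a condition rather than a fixed count, a while loop with an explicit counter would be the analogous approach.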
I have data that I want to separate by date. I have managed to do this manually with:
tsssplit <- split(tss, tss$created_at)
and then creating a data frame for each list element, which I then use:
t1 <- tsssplit[[1]]
t2 <- tsssplit[[2]]
But I don't know how many splits I will need, as sometimes the original data frame may have 6 dates to split by, sometimes 5, and so on. So I want to create a for loop.
Within the for loop, I want to incorporate this code, which connects to a function:
bscore3 <- score.sentiment(t3$cleaned_text,pos.words,neg.words,.progress='text')
score3 <- as.integer(bscore3$score[[1]])
Then I want to be able to create a new data frame that has the scores for each list.
So essentially I want the for loop to:
split the data into a list using split
split that list into a separate data frame for each day
come up with a score for each data frame
put the scores into a new data frame
It doesn't have to be exactly like this as long as I can come up with a visualisation of the scores at the end.
Thanks!
It is not recommended to create separate data frames in the global environment, as they are difficult to keep track of. Put them in a list instead. You have made a good start by using split to create a list of data frames. You can then iterate over each data frame in the list and apply the function to each one.
Using by, this would look like:
result <- by(tss, tss$created_at, function(x) {
  bscore3 <- score.sentiment(x$cleaned_text, pos.words, neg.words, .progress = 'text')
  score3 <- as.integer(bscore3$score[[1]])
  return(score3)
})
result
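If the goal is a single data frame of scores per day for plotting, the same idea can be sketched with split and vapply. score.sentiment is not shown in the question, so toy_score below is a hypothetical stand-in (positive minus negative word count), and the sample rows are invented:

```r
# Invented sample data with the same column names as in the question
tss <- data.frame(
  created_at   = c("2020-01-01", "2020-01-01", "2020-01-02"),
  cleaned_text = c("good day", "bad news", "good good"),
  stringsAsFactors = FALSE
)
pos.words <- c("good")
neg.words <- c("bad")

# Hypothetical stand-in for score.sentiment: positive minus negative word count
toy_score <- function(text, pos, neg) {
  words <- unlist(strsplit(text, " ", fixed = TRUE))
  sum(words %in% pos) - sum(words %in% neg)
}

# One list element per date, however many dates there are
by_day <- split(tss, tss$created_at)

# One score per date, collected into a named integer vector
scores <- vapply(by_day, function(d) toy_score(d$cleaned_text, pos.words, neg.words), integer(1))

# New data frame with one row per date
result <- data.frame(created_at = names(scores), score = unname(scores),
                     stringsAsFactors = FALSE)
print(result)
```

Because split produces as many elements as there are dates, nothing here depends on knowing the number of splits beforehand.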
I cannot for the life of me figure out where the simple error is in my for loop. I want to run the same analysis over multiple data frames and save each iteration's result as a new data frame, named after the input data frame plus an extra string to identify it.
Here is my code:
john and jane are two of the many data frames I hope to loop over and compare with bcm to find duplicated rows.
x <- list(john,jane)
for (i in x) {
test <- rbind(bcm,i)
test$dups <- duplicated(test$Full.Name,fromLast=T)
test$dups2 <- duplicated(test$Full.Name)
test <- test[which(test$dups==T | test$dups2==T),]
newname <- paste("dupl",i,sep=".")
assign(newname, test)
}
Thus far, I can either get the naming to work without including the x data, or get the loop to complete without naming the new data frames correctly.
Intended Result: I am hoping to create new data frames dupl.john and dupl.jane to show which rows are duplicated in comparison to bcm.
I understand that lapply() might be better to use and am very open to that form of solution. I could not figure out how to use it to solve my problem, so I turned to the more familiar for loop.
EDIT:
Sorry for not being clearer. I have about 13 data frames in total that I want to run the same analysis over to find the duplicate rows in $Full.Name. I could run the first 4 lines of my loop and then dupl.john <- test 13 times (once for each data frame), but I am purposely trying to write a for loop or lapply() to learn more R and because I'm sure it is more efficient.
If I understand your intended result correctly, match_df from plyr could be an option.
library(plyr)
dupl.john <- match_df(john, bcm)
dupl.jane <- match_df(jane, bcm)
dupl.john and dupl.jane will both be data frames, containing the rows of john and jane respectively that also appear in bcm. Is this what you are trying to achieve?
EDITED after the first comment
library(plyr)
l <- list(john, jane)
res <- lapply(l, function(x) match_df(x, bcm, on = "Full.Name"))
# [[ ]] extracts the data frame itself; [ ] would return a one-element list
dupl.john <- res[[1]]
dupl.jane <- res[[2]]
Now, res will have a list of the data frames with the matches, based on the column "Full.Name".
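For reference, the naming part of the question can also be handled in base R by looping over a named list, so the names never have to be recovered from the data frames themselves. This is only a sketch with invented sample data; find_dups is a hypothetical helper that mirrors the duplicated() logic from the question:

```r
# Invented sample data with a Full.Name column, as in the question
bcm  <- data.frame(Full.Name = c("Ann Lee", "Bob Ray", "Cy Day"), stringsAsFactors = FALSE)
john <- data.frame(Full.Name = c("Bob Ray", "Dee Fox"), stringsAsFactors = FALSE)
jane <- data.frame(Full.Name = c("Ann Lee", "Cy Day", "Eve Hill"), stringsAsFactors = FALSE)

# A named list keeps each data frame together with its name
frames <- list(john = john, jane = jane)

# Hypothetical helper: stack reference and x, keep rows whose name occurs more than once
find_dups <- function(x, reference) {
  combined <- rbind(reference, x)
  is_dup <- duplicated(combined$Full.Name) |
    duplicated(combined$Full.Name, fromLast = TRUE)
  combined[is_dup, , drop = FALSE]
}

dupl <- lapply(frames, find_dups, reference = bcm)
names(dupl) <- paste("dupl", names(frames), sep = ".")

print(names(dupl))
```

The results are then available as dupl$dupl.john and dupl$dupl.jane, and adding the other data frames only means adding them to frames.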
I am trying to produce data frames using a for loop.
How can I append these data frames to a list and then check whether any of them is empty?
I would like to remove the data frames with no rows from the list.
Any help is appreciated.
You should use lapply here instead of a for loop. The advantages are:
You want a list of data frames, and lapply returns a list.
You do the job in one pass, so there is no need for two loops.
Something like:
lapply(seq_len(nbr_df), function(x) {
  ## code to create your data.frame dt
  ## dt <- data.frame(...)
  ## return dt only if it has rows; empty ones come back as NULL
  if (nrow(dt) > 0) dt
})
Second option: the data frames already exist in separate variables.
We assume that your variable names share a certain pattern, say patt:
lapply(mget(ls(pattern=patt)),function(x)if(nrow(x)>0)x)
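Note that both variants leave a NULL in the list wherever a data frame was empty. A minimal runnable sketch that drops the empty ones outright with Filter; the row counts and the make_df helper are made up for illustration:

```r
# Hypothetical row counts for three generated data frames; the second is empty
set_sizes <- c(2, 0, 3)

# Hypothetical generator standing in for the code that builds each data frame
make_df <- function(n) data.frame(id = seq_len(n))

candidates <- lapply(set_sizes, make_df)

# Keep only the data frames that have at least one row
non_empty <- Filter(function(d) nrow(d) > 0, candidates)

print(length(non_empty))
```

Filter works on any list, so the same call could be applied to the result of mget(ls(pattern = patt)).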
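To append to a list inside a for loop, you can do something like this:
Your_list <- list()
for (i in numbOfPosibleDF) {
  k <- data.frame()  # placeholder: build the i-th data frame here
  if (nrow(k) != 0) {
    # "df" must be quoted, and [[ ]] stores the whole data frame as one element
    Your_list[[paste0("df", i)]] <- k
  }
}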
I would just add valid data frames to the list instead of removing them afterwards. If you want or need to use a for loop (instead of lapply), you may use the following:
# init list
list.of.df <- list()
# start your loop to
# create data frame etc.
# ....
df <- data.frame(1,2)
# add to list
if (!is.null(df) && nrow(df)>0) list.of.df[[length(list.of.df)+1]] <- df
# ... end of loop here.
For the benefit of anyone finding this otherwise dead-end page by its title, the way to concatenate consistently formatted data.frames that are items of a list is with plyr:
rbind.fill.matrix(lst)
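If plyr is not available and every data frame in the list has the same columns, base R can do the concatenation as well. A small sketch with invented data:

```r
# Invented list of data frames that all share the same columns
lst <- list(
  data.frame(id = 1:2, grp = c("a", "b"), stringsAsFactors = FALSE),
  data.frame(id = 3L,  grp = "c",         stringsAsFactors = FALSE)
)

# do.call passes every list element to rbind as a separate argument
combined <- do.call(rbind, lst)

print(combined)
```

This relies on the columns matching across the list, so it does not cover data frames with differing columns.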
I would like to give a better picture of the scenario:
The data frames may or may not have the same number of columns or rows.
The data frames are produced dynamically in a for loop.
The data frames contain all data types: numeric, factor, character.
I'm trying to use various data frames within a single for loop, i.e.:
# after loading the 5 data frames
for (i in 1:5) {
  dframe <- dataframe[i]
  print(sprintf("This is data frame %s", dframe))
}
However, this only passes the variable name and not the data frame itself. Thanks.
To obtain the data, use the get function:
dframe <- get(dataframe[i])
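A self-contained sketch of this, assuming dataframe is a character vector holding the variable names (the data frames sales and costs are invented for the example). mget does the same lookup for all names at once and returns a named list:

```r
# Invented data frames standing in for the ones loaded earlier
sales <- data.frame(x = 1:2)
costs <- data.frame(x = 3:5)

# Character vector of variable names, as in the question
dataframe <- c("sales", "costs")

for (i in seq_along(dataframe)) {
  dframe <- get(dataframe[i])  # look up the object by its name
  print(sprintf("Data frame %s has %d rows", dataframe[i], nrow(dframe)))
}

# mget fetches all of them in one call, as a list named after the variables
frames <- mget(dataframe)
```

With the list from mget, the loop can iterate over frames directly and use names(frames) for labels.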