I am wondering whether it's possible to loop over data frames and change the contents of a column in each one.
There are data frames named df1, df2, df3, ..., df100.
Each data frame has a food column containing the values a and b.
I want to change each a and b in df$food to apple and banana.
for (i in 1:100){
paste('df', i, '$food') <- factor(paste('df', i, '$food'), level = c(a,b), labels = c("apple","banana"))
}
Do you think looping like above is possible?
It would be easier if you put them in a list of dataframes and use lapply.
result <- lapply(mget(paste0('df', 1:100)), function(x) transform(x,
food = factor(food, levels=c("a","b"), labels=c("apple","banana"))))
To write the results back to the original data frames:
list2env(result, .GlobalEnv)
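As a minimal sketch of the same idea on a smaller scale, the snippet below builds three toy data frames (their contents are made up for illustration), collects them with mget, recodes the food column, and shows where list2env would copy them back.

```r
# Toy data: three data frames named df1, df2, df3 (hypothetical contents)
df1 <- data.frame(food = c("a", "b", "a"))
df2 <- data.frame(food = c("b", "b", "a"))
df3 <- data.frame(food = c("a", "a", "b"))

# mget() collects the objects into a named list: list(df1 = ..., df2 = ..., df3 = ...)
dfs <- mget(paste0("df", 1:3))

# Recode the food column in every data frame and return the modified copy
result <- lapply(dfs, function(x) {
  x$food <- factor(x$food, levels = c("a", "b"), labels = c("apple", "banana"))
  x
})

# Optional: overwrite df1, df2, df3 in the global environment with the results
list2env(result, envir = .GlobalEnv)

# Inspect one of the recoded columns
result$df1$food
```

Working on the list result directly is usually enough; list2env is only needed if later code still refers to df1, df2, and so on by name.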
Related
I have a list of dataframes and I want to loop over all dataframes to create new dataframes with only unique values. This is my code for creating 1 new dataframe:
dflist <- list(df1=df1, df2=df2, df3 = df3)
udf1 = unique(df1)
I don't know whether I should use a loop or a function. Any help?
Thanks in advance!
Given that you want to keep the unique rows in each data frame I'd do something like this.
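library(dplyr) # distinct() comes from dplyr
# note: this overwrites df1, df2, ... in the global environment with their unique rows
lapply(seq_along(dflist), function(l, n, i) {
assign(paste0(n[[i]]), distinct(l[[i]]), envir = globalenv())
}, l=dflist, n=names(dflist))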
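A base-R alternative, sketched here under the assumption that keeping the results in a list is acceptable, avoids assign altogether. The toy data and the name udflist are hypothetical.

```r
# Toy input with duplicated rows (hypothetical data)
dflist <- list(
  df1 = data.frame(x = c(1, 1, 2), y = c("a", "a", "b")),
  df2 = data.frame(x = c(3, 3, 3), y = c("c", "c", "c"))
)

# unique() on a data frame drops duplicated rows; lapply() applies it to each element
udflist <- lapply(dflist, unique)

# Prefix the names with "u" so the results read as udf1, udf2, ...
names(udflist) <- paste0("u", names(dflist))

# Number of rows left in each data frame
sapply(udflist, nrow)
```

Individual results are then available as udflist$udf1, udflist$udf2, and so on, without creating extra objects in the workspace.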
I already have a list of data frames (mylist) and need to switch the first and second column for all the data frames in the list.
Test Data Frame in List
[reads] [phylum]
1 phylum1
2 phylum2
3 phylum3
Into....
[phylum] [reads]
phylum1 1
phylum2 2
phylum3 3
I know I need to use lapply, but not sure what to input for the FUN=
mylist <- lapply(mylist, FUN = mylist[ ,c("phylum", "reads")])
This fails with an error saying "incorrect number of dimensions".
Sorry if this is a simple question and thanks in advance for your help!
-Brand new R user
FUN expects a function that lapply can apply to every element of the list. You are passing mylist[ ,c("phylum", "reads")], which is not a function.
# sample data
df1 <- data.frame(reads = sample(10,4), phylum = sample(10,4))
df2 <- data.frame(reads = sample(10,4), phylum = sample(10,4))
df3 <- data.frame(reads = sample(10,4), phylum = sample(10,4))
df4 <- data.frame(reads = sample(10,4), phylum = sample(10,4))
ldf <- list(df1,df2,df3,df4)
ldf_re <- lapply(ldf, FUN = function(X){X[c('phylum', 'reads')]})
In the last line, lapply iterates over all the data frames. Each one is passed as the X argument to the function given in FUN, and the returned data frames, with their columns rearranged, are stored in the list ldf_re.
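The same pattern applied to data shaped like the question's example might look like the sketch below. The list contents and the element names s1 and s2 are invented for illustration; the positional variant is an optional extra, not part of the answer above.

```r
# Hypothetical named list of data frames with columns in the order reads, phylum
mylist <- list(
  s1 = data.frame(reads = 1:3, phylum = c("phylum1", "phylum2", "phylum3")),
  s2 = data.frame(reads = 4:6, phylum = c("phylum4", "phylum5", "phylum6"))
)

# The anonymous function receives one data frame at a time and returns it reordered
swapped <- lapply(mylist, function(df) df[, c("phylum", "reads")])

# Positional variant: swap columns 1 and 2 and keep any remaining columns in place
swapped_pos <- lapply(mylist, function(df) df[, c(2, 1, seq_along(df)[-(1:2)])])

# Column order of each reordered data frame
lapply(swapped, names)
```

Selecting by name is safer when every data frame is known to have the same column names; the positional form helps when the names differ between data frames.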
Suppose I have 3 dataframes in the current R environment, named as d1f, df2, df_3. There is no pattern for their names. How can I access one dataframe by its name?
For example, I have a for loop to process the three dataframes. How can I do something like this?
df_names<-c("d1f", "df2", "df_3")
for(name in df_names)
{
df<-some_function(name)
....some action on df....
}
It is best to store the data frames in a list, like so (numeric vectors stand in for the data frames here):
set.seed(1)
d1f = rnorm(10)
df2 = rnorm(10)
df_3 = rnorm(10)
dfs = list(d1f, df2, df_3)
for (i in seq_along(dfs)){
dfs[[i]] = dfs[[i]] + 1 # e.g. add 1 to each element of the three objects
}
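If the objects already exist under arbitrary names, base R's get and mget look them up from character strings. The following is a small sketch with made-up data frames; the column v is hypothetical.

```r
# Hypothetical data frames with unrelated names
d1f  <- data.frame(v = 1:3)
df2  <- data.frame(v = 4:6)
df_3 <- data.frame(v = 7:9)

df_names <- c("d1f", "df2", "df_3")

# get() returns the single object bound to a name given as a string
one <- get("df2")

# mget() returns a named list of all requested objects
dfs <- mget(df_names)

# Process every data frame, then index the list by name
dfs <- lapply(dfs, function(df) {
  df$v <- df$v + 1
  df
})
dfs[["df_3"]]$v
```

Inside the question's for loop, get(name) would play the role of some_function(name).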
I have several data frames and want to pass them as parameters to a function for processing. Let's say there are 4 data frames and I want to rename the first column of each to 'ROWNUM'.
df1 = data.frame(c(1:10),sample(1:100,10))
df2 = data.frame(c(1:10),sample(1:100,10))
df3 = data.frame(c(1:10),sample(1:100,10))
df4 = data.frame(c(1:10),sample(1:100,10))
function(df) colnames(df)[1] = 'ROWNUM'
My objective is to rename them all in one go rather than passing them one by one.
Thanks.
We can use lapply after keeping the datasets in a list
nm1 <- ls(pattern="df\\d+")
lst <- lapply(mget(nm1), function(x) {
colnames(x)[1] <- 'ROWNUM'
x})
It is better to keep the datasets in a list, but if we need to update the original datasets
list2env(lst, envir=.GlobalEnv)
Or we use assign
for(j in seq_along(nm1)){
assign(nm1[j], `names<-`(get(nm1[j]),
c("ROWNUM", names(get(nm1[j]))[-1])))
}
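If the goal is literally one function call that accepts several data frames, a small wrapper around lapply is one option. This is a sketch only: rename_first and its new_name argument are hypothetical names, not part of any package.

```r
# Hypothetical helper: accepts any number of data frames via `...`
# and returns a list in which the first column of each is renamed
rename_first <- function(..., new_name = "ROWNUM") {
  lapply(list(...), function(df) {
    names(df)[1] <- new_name
    df
  })
}

# Toy inputs (made-up data)
df1 <- data.frame(a = 1:3, b = c(10, 20, 30))
df2 <- data.frame(c = 4:6, d = c(40, 50, 60))

# Naming the arguments keeps the names on the returned list
out <- rename_first(df1 = df1, df2 = df2)
names(out$df2)
```

The function returns modified copies; the originals are untouched unless the result is assigned back, for example with list2env as above.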
I have the following named list output from an analysis. The reproducible code is as follows:
list(structure(c(-213.555409754509, -212.033637890131, -212.029474755074,
-211.320398316741, -211.158815833294, -210.470525157849), .Names = c("wasn",
"chappal", "mummyji", "kmph", "flung", "movie")), structure(c(-220.119433774144,
-219.186901747536, -218.743319709963, -218.088361753899, -217.338920075687,
-217.186050877079), .Names = c("crazy", "wired", "skanndtyagi",
"andr", "unveiled", "contraption")))
I want to convert this to a data frame. I have tried unlist to data frame options using reshape2, dplyr and other solutions given for converting a list to a data frame but without much success. The output that I am looking for is something like this:
Col1 Val1 Col2 Val2
1 wasn -213.55 crazy -220.11
2 chappal -212.03 wired -219.18
3 mummyji -212.02 skanndtyagi -218.74
and so on. The actual output has multiple columns with paired values and runs to many rows. I have already tried the following code:
do.call(rbind, lapply(df, data.frame, stringsAsFactors = TRUE))
works partially: it puts all the character values in one column and the numeric values in a second.
data.frame(Reduce(rbind, df))
didn't work: it gives the names from the first list and the numbers from both lists as two different rows
colNames <- unique(unlist(lapply(df, names)))
M <- matrix(0, nrow = length(df), ncol = length(colNames),
dimnames = list(names(df), colNames))
matches <- lapply(df, function(x) match(names(x), colNames))
M[cbind(rep(sequence(nrow(M)), sapply(matches, length)),
unlist(matches))] <- unlist(df)
M
didn't work correctly.
Can someone help?
Since the list elements are all of the same length, you should be able to stack them and then combine them by columns.
Try:
do.call(cbind, lapply(myList, stack))
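To make the shape of the result concrete, here is a sketch using the first three entries of each vector from the question, rounded to two decimals. stack returns columns named values and ind, so the snippet also reorders and renames them to match the requested layout; the Col/Val naming is taken from the question, and the intermediate names are assumptions.

```r
# First three entries of each vector from the question, rounded
myList <- list(
  c(wasn = -213.56, chappal = -212.03, mummyji = -212.03),
  c(crazy = -220.12, wired = -219.19, skanndtyagi = -218.74)
)

# stack() turns a named vector into a data frame with columns `values` and `ind`
stacked <- lapply(myList, stack)

# Put the name column first within each pair and store it as character
stacked <- lapply(stacked, function(d) {
  data.frame(name = as.character(d$ind), value = d$values,
             stringsAsFactors = FALSE)
})

# Bind the pairs side by side and label them Col1, Val1, Col2, Val2, ...
out <- do.call(cbind, stacked)
names(out) <- paste0(c("Col", "Val"), rep(seq_along(myList), each = 2))
out
```

This relies on every vector having the same length, as the answer notes; vectors of different lengths would need padding before cbind.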
Here's another way:
as.data.frame( c(col = lapply(x, names), val = lapply(x,unname)) )
How it works. lapply returns a list; two lists combined with c make another list; and a list is easily coerced to a data.frame, since the latter is just a list of vectors having the same length.
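A short sketch of those steps on a tiny made-up list shows what each stage produces. Note that the columns come out grouped (all names, then all values); the last lines interleave them into name/value pairs, which is an optional extra step not in the answer above.

```r
# Tiny hypothetical input: a list of two named numeric vectors
x <- list(c(a = 1, b = 2), c(c = 3, d = 4))

# Two lists combined with c(); the prefixes become col1, col2, val1, val2
L <- c(col = lapply(x, names), val = lapply(x, unname))
names(L)

# A list of equal-length vectors coerces directly to a data frame
df <- as.data.frame(L, stringsAsFactors = FALSE)

# Optional: interleave so each name column sits next to its value column
n <- length(x)
paired <- df[, c(rbind(seq_len(n), n + seq_len(n)))]
names(paired)
```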
Better than coercing to a data.frame is just modifying its class, effectively telling the list "you're a data.frame now":
L = c(col = lapply(x, names), val = lapply(x,unname))
library(data.table)
setDF(L)
The result doesn't need to be assigned anywhere with = or <- because L is modified "in place."