R Combine more than two lists elements with RegEx - r

I have multiple lists starting with the same name.
(values_1, values_2,values_n)
Is there a way to combine them like
all_lists <- c(values_*)

As suggested in Ronak Shah's comment:
You have to work with the global environment, .GlobalEnv.
The function ls() returns the names of all objects currently defined in .GlobalEnv.
Its pattern parameter restricts the result to objects whose names match a regular expression.
ls() returns a character vector with the names of the objects, not the objects themselves.
To access the value of an object by its name, use the get() function.
When you have multiple names, use mget(), which returns a named list. So the final snippet is:
list_data <- mget(ls(pattern = 'values_'))
If you want to do the same with dataframes, here is a working example:
mtc_1 <- mtcars
mtc_2 <- mtcars
mtc_3 <- mtcars
list_data <- mget(ls(pattern = 'mtc_'))
do.call(rbind, list_data)
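For the plain lists in the question, the same lookup can feed c() through do.call. A minimal sketch, assuming three small lists named values_1 to values_3 (made-up data) that live in a dedicated environment env (a hypothetical name, used here so that unrelated objects cannot match the pattern):

```r
# Hypothetical example data, kept in its own environment
env <- new.env()
env$values_1 <- list(a = 1, b = 2)
env$values_2 <- list(c = 3)
env$values_3 <- list(d = 4, e = 5)

# Names of every object in env that starts with "values_"
list_names <- ls(envir = env, pattern = "^values_")

# Fetch the lists, drop the outer names, and concatenate them with c()
all_lists <- do.call(c, unname(mget(list_names, envir = env)))

names(all_lists)
```

Dropping the outer names with unname() keeps the element names of the original lists instead of prefixing them with the object names.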

Related

Apply an `as.character()` function to a list of dataframes

So essentially I have a list of dataframes that I want to apply as.character() to.
To obtain the list of dataframes I have a list of files that I read in using a map() function and a read function that I created. I can't use map_df() because there are columns that are being read in as different data types. All of the files have the same layout, and I know that I could hard code the data types in the read function if I wanted, but I want to avoid that if I can.
At this point I throw the list of dataframes in a for loop and apply another map() function to apply the as.character() function. This final list of dataframes is then compressed using bind_rows().
All in all, this seems like an extremely convoluted process, see code below.
library(readxl)
library(purrr)
library(dplyr)
audits <- list.files()
my_reader <- function(x) {
  read_xlsx(x)
}
audits <- map(audits, my_reader)
# Convert every column of every data frame to character
for (i in seq_along(audits)) {
  audits[[i]] <- map_df(audits[[i]], as.character)
}
audits <- bind_rows(audits)
Does anybody have any ideas on how I can improve this? Ideally to the point where I can do everything in a single vectorised map() function?
For reproducibility, you can use two iris datasets, with the data type of one column changed in one of them.
iris2 <- iris
iris2[[1]] <- as.character(iris2[[1]])
my_list <- list(iris, iris2)
as.character works on a vector, whereas a data.frame is a list of vectors, so the conversion has to be applied to each column. An option is to use across if we want only a single use of map:
library(dplyr)
library(purrr)
map_dfr(my_list, ~ .x %>%
  mutate(across(everything(), as.character)))
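To see why the conversion has to go column by column, compare as.character() on a whole data frame with as.character() applied to each column. A small base R sketch, using a made-up data frame df:

```r
# Hypothetical two-column data frame
df <- data.frame(x = c(1.5, 2.5, 3.5), y = c("a", "b", "c"))

# On the whole data frame: one string per column, not one per value
whole <- as.character(df)
length(whole)

# Column by column: every value is converted and the shape is kept
df[] <- lapply(df, as.character)
str(df)
```

Assigning into df[] keeps the result a data frame rather than a plain list.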
I wanted to show a base R solution in case it helps anyone else. You can use rapply to recursively go through the list and apply a function. You can specify the classes to match, and whether to replace the elements in place or to unlist/list the returned object:
iris2 <- iris
iris2[[1]] <- as.character(iris2[[1]])
my_list <- list(iris, iris2)
mylist2 <- rapply(my_list, classes = "ANY", f = as.character, how = "replace")
bigdf <- do.call(rbind, mylist2)

A Function to Merge 100 Dataframes to one Dataframe

I am new to programming and R is my first programming language to learn.
I want to merge 100 dataframes; each dataframe contains one column and 20 observations, as shown below:
df1 <- as.data.frame(c(6,3,4,4,5,...))
df2 <- as.data.frame(c(2,2,3,5,10,...))
df3 <- as.data.frame(c(5,9,2,3,7,...))
...
df100 <- as.data.frame(c(4,10,5,9,8,...))
I tried using df.list <- list(df1:df100) to construct an overall dataframe for all of the dataframes but I am not sure if df.list merges all the columns from all the dataframes together in a table.
Can anyone tell me if I am right? And what do I need to do?
We can use mget to get all the objects into a list, specifying a pattern in ls that matches object names that start (^) with 'df' followed by one or more digits (\\d+) until the end ($) of the string:
df.list <- mget(ls(pattern = '^df\\d+$'))
From the list, if we want to cbind all the datasets, use cbind in do.call:
out <- do.call(cbind, df.list)
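One thing to watch: ls() returns names sorted alphabetically, so df10 typically sorts before df2 and the columns may not come out in numeric order. A sketch of one way to restore the order, assuming twelve one-column data frames created in a scratch environment env (hypothetical data and names):

```r
# Hypothetical data: twelve one-column data frames in a scratch environment
env <- new.env()
for (i in 1:12) {
  assign(paste0("df", i), data.frame(v = rep(i, 3)), envir = env)
}

df_names <- ls(envir = env, pattern = "^df\\d+$")

# Reorder by the numeric suffix instead of alphabetically
df_names <- df_names[order(as.integer(sub("^df", "", df_names)))]

# Bind the columns and label each one with the name of its source object
out <- do.call(cbind, unname(mget(df_names, envir = env)))
names(out) <- df_names
```

Building the names with paste0("df", 1:12) instead of ls() would give the numeric order directly, at the cost of failing if one of the objects is missing.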
NOTE: It is better not to create multiple objects in the global environment. We could instead read all the data into a list directly, or construct the data within a list. For example, if the data come from .csv files, get all the .csv files from the directory of interest with list.files, loop over the files with lapply, read each one with read.csv, and cbind the results:
files <- list.files(path = 'path/to/your/location',
pattern = '\\.csv$', full.names = TRUE)
out <- do.call(cbind, lapply(files, read.csv))
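The list-first approach can be tried end to end without real data. The following sketch writes three small CSV files to a temporary folder (the folder name csv_demo, the file names, and the contents are all made up for illustration) and then reads them straight into a named list:

```r
# Hypothetical setup: write three small CSV files to a temporary folder
csv_dir <- file.path(tempdir(), "csv_demo")
dir.create(csv_dir, showWarnings = FALSE)
for (i in 1:3) {
  write.csv(data.frame(value = i * 1:4),
            file.path(csv_dir, paste0("df", i, ".csv")),
            row.names = FALSE)
}

# Read every .csv file into one list, named after the files
files <- list.files(path = csv_dir, pattern = "\\.csv$", full.names = TRUE)
df.list <- lapply(files, read.csv)
names(df.list) <- tools::file_path_sans_ext(basename(files))

# Bind the columns side by side and label them by source file
out <- do.call(cbind, unname(df.list))
names(out) <- names(df.list)
```

Because the data frames only ever exist as list elements, no ls() or mget() lookup is needed afterwards.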
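We can also use the reduce function from the purrr package, after creating a character vector with the names of the data frames. Note that this stacks the data frames by row. The first data frame serves as the initial value, so it is dropped from the vector that is iterated over; otherwise it would be bound twice:
library(dplyr)
library(purrr)
df_names <- paste0("df", 1:100)
df_names[-1] %>%
  reduce(.init = get(df_names[1]), ~ bind_rows(..1, get(..2)))
Or in base R:
Reduce(function(x, y) rbind(x, get(y)), df_names[-1], init = get(df_names[1]))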

do.call skip error and continue processing

After a for loop I have 4 dataframes (data1, data2, data3, data4), and I want to rbind all of them.
I tried:
do.call(rbind, mget(paste0("data", 1:4)))
but sometimes the for loop gives me only 3 of them, for example data1, data2, and data4.
It seems that do.call doesn't know how to handle this.
How can I still get an rbind of data1, data2, and data4?
You can get all your objects from the global environment (via ls()) and use grep to keep the ones that follow the pattern you need, i.e.
do.call(rbind, mget(grep('^data[0-9]+$', ls(), value = TRUE)))
Maybe check if dataframe exists in the environment and mget only those.
data_names <- paste0("data", 1:4)
do.call(rbind, mget(data_names[sapply(data_names, exists)]))
You can use the pattern matching mechanism in ls to identify your objects. mget takes a character vector of object names, and the pattern argument of ls accepts a regular expression, which is more flexible than generating object names via paste.
data_cars_one <- mtcars
data_cars_two <- mtcars
library(tidyverse)
res_all <- bind_rows(mget(x = ls(pattern = "^data")))
Concerning the binding, I've used bind_rows just as an alternative to do.call and Reduce solutions.
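Another way to sidestep the missing-object problem is to collect the results in a list while the loop runs, so that only the data frames that were actually produced end up in it. A minimal sketch, assuming a made-up loop in which the third iteration yields nothing (the list name results and the data are hypothetical):

```r
# Hypothetical loop: iteration 3 yields nothing, mimicking a missing data3
results <- list()
for (i in 1:4) {
  if (i == 3) next  # stand-in for an iteration that produces no data
  results[[paste0("data", i)]] <- data.frame(id = i, value = i^2)
}

# Only the data frames that were actually created are in the list
combined <- do.call(rbind, results)
```

With this layout there is nothing to look up by name afterwards, so neither ls() nor exists() is needed.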

Group R objects into a list

I have loaded a series of SpatialPolygonsDataFrames into my workspace. Each of the named objects has either "_adm0", "_adm1", or "_adm2" attached to the country abbreviation. For Germany, this would look like "DEU_adm0", "DEU_adm1", and "DEU_adm2".
I'm trying to gather all of the "_adm0" data frames into a list which can then be operated on by ldply and fortify. I could do that with
mylist <- list(DEU_adm0, FRA_adm0, RUS_adm0, etc...)
where I write out all of the countries that I want to be included in the list.
But how do I grab all of the "_adm0" data frames by a pattern?
I have started with the code below, but it doesn't give me the same result as writing out the list by hand:
library(stringr)
adm0list <- ls()[str_detect(ls(), "_adm0")]
mylist <- sapply(adm0list, function(x) get(x))
or alternatively,
mylist <- mget(adm0list, .GlobalEnv)
I do get a list of objects with the sapply method, and with mget(), but I'm not seeing why those lists are different from the one built with list() and the object names directly. I suspect the answer to that question will tell me why ldply + fortify works with the list() method but not the other two.
You could use the pattern argument of ls and then use the @ slot extractor to get the data.frame portion of your SPDF objects:
# Construct a list of objects with mget
ll <- mget(ls(pattern = "_adm0"))
# Extract the data.frames
out <- lapply(ll, function(x) x@data)
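The pattern lookup and the @ extraction can be tried without the sp package. The sketch below uses a toy S4 class, ToySPDF, as a hypothetical stand-in for SpatialPolygonsDataFrame (it only has a data slot), and keeps the objects in a scratch environment env:

```r
# Hypothetical stand-in for SpatialPolygonsDataFrame: an S4 class with a data slot
setClass("ToySPDF", slots = c(data = "data.frame"))

env <- new.env()
env$DEU_adm0 <- new("ToySPDF", data = data.frame(country = "DEU", level = 0))
env$FRA_adm0 <- new("ToySPDF", data = data.frame(country = "FRA", level = 0))
env$DEU_adm1 <- new("ToySPDF", data = data.frame(country = "DEU", level = 1))

# Collect only the objects whose names end in "_adm0"
ll <- mget(ls(envir = env, pattern = "_adm0$"), envir = env)

# Pull the data slot out of each object with @
out <- lapply(ll, function(x) x@data)
names(out)
```

Note that mget() returns a named list, whereas list(DEU_adm0, FRA_adm0) is unnamed; unname(ll) gives the unnamed form if a later step expects it.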

Apply an already defined function to all dataframes at once [duplicate]

I already have defined a function (which works fine). Nevertheless, I have 20 dataframes in the working space to which I want to lapply the same function (dat1 to dat20).
So far it looks like this:
dat1 <- func(dat=dat1)
dat2 <- func(dat=dat2)
dat3 <- func(dat=dat3)
dat4 <- func(dat=dat4)
...
dat20 <- func(dat=dat20)
However, is there a way to do this more elegantly with a shorter command, i.e. to lapply the function to all dataframes at once?
I tried this, but it didn't work:
mylist <- paste0("dat", 1:20, sep="")
lapply(mylist, func)
Try something like:
lapply(mget(ls(pattern = "^dat[0-9]+$")), func)
Some details: The pattern argument in ls will limit which object names it lists (e.g., I assume you have other objects including your function in the global environment). mget retrieves those objects from the environment and turns them into a list, which you can then lapply your function over.
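A self-contained sketch of that approach follows. The function func, the data, and the scratch environment env are all made up for illustration; list2env() is one possible way to write the processed data frames back under their original names if that is really needed:

```r
# Hypothetical function and data, kept in a scratch environment
func <- function(dat) {
  dat$total <- dat$a + dat$b
  dat
}
env <- new.env()
for (i in 1:3) {
  assign(paste0("dat", i), data.frame(a = i, b = 10), envir = env)
}

# Apply func to every object whose name is "dat" followed by digits
dat_names <- ls(envir = env, pattern = "^dat[0-9]+$")
results <- lapply(mget(dat_names, envir = env), func)

# Optional: write the processed data frames back under their original names
list2env(results, envir = env)
env$dat2$total
```

Keeping the processed data frames in results is usually enough; writing them back only matters when later code refers to dat1, dat2, and so on by name.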
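If you have the name of a variable, you can use get() to retrieve its value from the workspace. The corresponding assignment function is assign(). Inside a function, assign() writes to that function's own environment by default, so the target environment has to be given explicitly:
mylist <- paste0("dat", 1:20)
lapply(mylist, function(name) assign(name, func(dat = get(name)), envir = .GlobalEnv))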
The desired behavior can also be obtained using eval instead of lapply.
Assume mylist holds the names of the data frames you want to apply func to. mylist might be generated using
mylist <- ls(pattern = "^dat[0-9]+$")
Then you can use the following code to do exactly what you want:
# Build one assignment per data frame, e.g. "dat1 <- func(dat1)"
cCmd <- paste0(mylist, " <- func(", mylist, ")")
eCmd <- parse(text = cCmd)
# Evaluate the parsed assignments
eval(eCmd)
