Looping through environment objects with a special pattern - r

I have a multiple lists in my environment(all start with "CDS_"). Each list is conducted of multiple sub lists.I want to call the lists one by one to apply a function for each of these objects.
This is what I am trying:
lists<-grep("CDS_",names(.GlobalEnv),value=TRUE) #Lists all objectrs staring with "CDS_"
for (i in seq_along(lists)){
data<-do.call("list",mget(lists[i])) #this line blends all sub lists into one list
assign(paste("Df_", lists[i], sep = "_"), my_function(data) # my_function requires a list with multiple sub lists
}
but the issue is the do.call("list",mget(lists[i])) blends all sub lists into one. For example if there is a list with one sub list it returns the list but all sub lists go into one!
Any solutions how to make this work?
here is a sample to test:
#Defining my_function pulling out the sub list which contains "sample1"
my_function<-function(.data){
# pull out the undergraduate data
grep("sample1", .data, value = TRUE)
}
# 1st list
list_1 <- list(1:54,
c("This","is","sample1","for","list1"),
c("This","is","sample2","for","list1"),
"Hi")
# 2nd list
list_2 <- list(51:120,
c("This","is","sample1","for","list1"),
c("This","is","sample2","for","list1"),
"Bus")
# 3rd list
list_3 <- list(90:120,
letters[16:11],
2025)
lists<-grep("list_",names(.GlobalEnv),value=TRUE)
for (i in seq_along(lists)){
data<-do.call("list",mget(lists[i]))
assign(paste("sample1_", lists[i], sep = ""), my_function(data))
}

As mentioned by #MrFlick, R has a ton of list functionality. It is usually the case that you are better off storing your lists in a list than trying to directly edit them in the environment. Here is one possible solution using base R:
l <- mget(ls(pattern = "^list_\\d$")) # store lists in a list
lapply(l, \(x) lapply(x, my_function))
$list_1
$list_1[[1]]
character(0)
$list_1[[2]]
[1] "sample1"
$list_1[[3]]
character(0)
$list_1[[4]]
character(0)
$list_2
$list_2[[1]]
character(0)
$list_2[[2]]
[1] "sample1"
$list_2[[3]]
character(0)
$list_2[[4]]
character(0)
$list_3
$list_3[[1]]
character(0)
$list_3[[2]]
character(0)
$list_3[[3]]
character(0)
Update
Sticking with base R to remove non-matches you could do:
lapply(l, \(x) Filter(length, lapply(x, my_function)))
$list_1
$list_1[[1]]
[1] "sample1"
$list_2
$list_2[[1]]
[1] "sample1"
$list_3
list()
A purrr solution would be:
library(purrr)
map(map_depth(l, 2, my_function), compact)

When you have lists of lists, and option is rapply, the recursive version of lapply.
my_function<-function(.data){
# pull out the undergraduate data
grep("sample1", .data, value = TRUE)
}
lists <- mget(ls(pattern = "^list_"))
rapply(lists, my_function, how = "list")
#> $list_1
#> $list_1[[1]]
#> character(0)
#>
#> $list_1[[2]]
#> [1] "sample1"
#>
#> $list_1[[3]]
#> character(0)
#>
#> $list_1[[4]]
#> character(0)
#>
#>
#> $list_2
#> $list_2[[1]]
#> character(0)
#>
#> $list_2[[2]]
#> [1] "sample1"
#>
#> $list_2[[3]]
#> character(0)
#>
#> $list_2[[4]]
#> character(0)
#>
#>
#> $list_3
#> $list_3[[1]]
#> character(0)
#>
#> $list_3[[2]]
#> character(0)
#>
#> $list_3[[3]]
#> character(0)
Created on 2022-05-13 by the reprex package (v2.0.1)
Edit
To answer to the OP's comment to another answer, to keep only the matches, save the rapply result and a lapply loop calling lengths, the list version of vector length is used to extract the matches.
r <- rapply(lists, my_function, how = "list")
lapply(r, \(x) x[lengths(x) > 0])
#> $list_1
#> $list_1[[1]]
#> [1] "sample1"
#>
#>
#> $list_2
#> $list_2[[1]]
#> [1] "sample1"
#>
#>
#> $list_3
#> list()
Created on 2022-05-13 by the reprex package (v2.0.1)

Related

How to print a list with out printing line numbers?

I have a list of ggplot objects that I am using the print() function to display. When I do this in r markdown it adds ## $`item name` where "item name" is the name of the object in the list. It adds this before every ggplot object. How do I get rid of this?
you can use unname function to make the list printed as usual
a <- list(`one` = 2 , `two` = 3 , `three` = 4)
print(unname(a))
#> [[1]]
#> [1] 2
#>
#> [[2]]
#> [1] 3
#>
#> [[3]]
#> [1] 4

Apply regmatches function to a list of chr in R

I have this list of character stored in a variable called x:
x <-
c(
"images/logos/france2.png",
"images/logos/cnews.png",
"images/logos/lcp.png",
"images/logos/europe1.png",
"images/logos/rmc-bfmtv.png",
"images/logos/sudradio.png",
"images/logos/franceinfo.png"
)
pattern <- "images/logos/\\s*(.*?)\\s*.png"
regmatches(x, regexec(pattern, x))[[1]][2]
I wish to extract a portion of each chr string according to a pattern, like this function does, which works fine but only for the first item in the list.
pattern <- "images/logos/\\s*(.*?)\\s*.png"
y <- regmatches(x, regexec(pattern, x))[[1]][2]
Only returns:
"france2"
How can I apply the regmatches function to all items in the list in order to get a result like this?
[1] "france2" "europe1" "sudradio"
[4] "cnews" "rmc-bfmtv" "franceinfo"
[7] "lcp" "rmc" "lcp"
FYI this is a list of src tags that comes from a scraper
Try gsub
gsub(
".*/(.*)\\.png", "\\1",
c(
"images/logos/france2.png", "images/logos/cnews.png",
"images/logos/lcp.png", "images/logos/europe1.png",
"images/logos/rmc-bfmtv.png", "images/logos/sudradio.png",
"images/logos/franceinfo.png"
)
)
which gives
[1] "france2" "cnews" "lcp" "europe1" "rmc-bfmtv"
[6] "sudradio" "franceinfo"
Output of regmatches(..., regexec(...)) is a list. You may use sapply to extract the 2nd element from each element of the list.
sapply(regmatches(x, regexec(pattern, x)), `[[`, 2)
#[1] "france2" "europe1" "sudradio" "cnews" "rmc-bfmtv" "franceinfo"
#[7] "lcp" "rmc" "lcp"
You may also use the function basename + file_path_sans_ext from tools package which would give the required output directly.
tools::file_path_sans_ext(basename(x))
#[1] "france2" "europe1" "sudradio" "cnews" "rmc-bfmtv" "franceinfo"
#[7] "lcp" "rmc" "lcp"
A possible solution:
library(tidyverse)
df <- data.frame(
stringsAsFactors = FALSE,
strings = c("images/logos/france2.png","images/logos/cnews.png",
"images/logos/lcp.png","images/logos/europe1.png",
"images/logos/rmc-bfmtv.png","images/logos/sudradio.png",
"images/logos/franceinfo.png")
)
df %>%
mutate(strings = str_remove(strings, "images/logos/") %>%
str_remove("\\.png"))
#> strings
#> 1 france2
#> 2 cnews
#> 3 lcp
#> 4 europe1
#> 5 rmc-bfmtv
#> 6 sudradio
#> 7 franceinfo
Or even simpler:
library(tidyverse)
df %>%
mutate(strings = str_extract(strings, "(?<=images/logos/)(.*)(?=\\.png)"))
#> strings
#> 1 france2
#> 2 cnews
#> 3 lcp
#> 4 europe1
#> 5 rmc-bfmtv
#> 6 sudradio
#> 7 franceinfo

Create dynamic R dataframe names in for loop - multiple names in same code line

I am trying to create dynamic dataframe names within a for loop. I am using the paste function in R to write the dataframe names. see the example below:
for (i in 1:3){
paste("Data",i,sep="") <- data.frame(colone=c(1,2,3,4),coltwo=c(5,6,7,8))
paste("New data",i,sep="") <- paste("Data",i,sep="") %>% mutate(colthree=(colone+coltwo)*i) %>% select(colthree)
}
The code above won't work as R doesn't understand paste as a dataframe name. I have found some solutions using the assign function which could help with my 1st line of code using: assign(paste("Data",i,sep=""),data.frame(colone=c(1,2,3,4),coltwo=c(5,6,7,8))) but I don't know what to do with the 2nd line where the paste function is used twice to refer to multiple dataframes. Not sure using a nested assign function works and even if it does the code will look terrible with more complex code.
I know there might be ideas of how to combine the 2 lines above into a single assign statement or other similar solutions but is there any way to refer to 2 dynamic dataframe names within a single line of code as per my example above?
Many thanks :)
If you need both data frames ("Data i" and "New Data i") you can use:
for (i in 1:3){
assign(paste("New data",i,sep=""), data.frame(assign(paste("Data",i,sep=""),data.frame(colone=c(1,2,3,4),coltwo=c(5,6,7,8))) %>% mutate(colthree=(colone+coltwo)*i) %>% select(colthree)))
}
If you only want "New Data i" use:
for (i in 1:3){
assign(paste("New data",i,sep=""), data.frame(colone=c(1,2,3,4),coltwo=c(5,6,7,8))) %>% mutate(colthree=(colone+coltwo)*i) %>% select(colthree)
}
This seems to be working but it's a little bit convoluted:
library(tidyverse)
library(stringr)
library(rlang)
#>
#> Attaching package: 'rlang'
#> The following objects are masked from 'package:purrr':
#>
#> %#%, as_function, flatten, flatten_chr, flatten_dbl, flatten_int,
#> flatten_lgl, flatten_raw, invoke, list_along, modify, prepend,
#> splice
i <- seq(1, 3, 1) #how many loops
pwalk(list(paste("Data",i,sep=""), paste("New_data",i,sep=""), i), ~{
assign(..1, data.frame(colone=c(1,2,3,4),coltwo=c(5,6,7,8)), envir = .GlobalEnv)
new <- sym(..1) #convert a string to a variable name
assign(..2, {eval_tidy(new)%>% mutate(colthree=(colone+coltwo)*..3) %>% select(colthree)}, envir = .GlobalEnv)
})
names(.GlobalEnv)
#> [1] "Data1" "Data2" "Data3" "i" "New_data1" "New_data2"
#> [7] "New_data3"
Data1
#> colone coltwo
#> 1 1 5
#> 2 2 6
#> 3 3 7
#> 4 4 8
New_data1
#> colthree
#> 1 6
#> 2 8
#> 3 10
#> 4 12
Created on 2021-06-10 by the reprex package (v2.0.0)

Replacing nested list using a vector of names of depths as an index

Take a simple nested list L:
L <- list(lev1 = list(lev2 = c("bit1","bit2")), other=list(yep=1))
L
#$lev1
#$lev1$lev2
#[1] "bit1" "bit2"
#
#
#$other
#$other$yep
#[1] 1
And a vector giving a series of depths for each part I want to select from L:
sel <- c("lev1","lev2")
The result I want when indexing is:
L[["lev1"]][["lev2"]]
#[1] "bit1" "bit2"
Which I can generalise using Reduce like so:
Reduce(`[[`, sel, init=L)
#[1] "bit1" "bit2"
Now, I want to extend this logic to do a replacement, like so:
L[["lev1"]][["lev2"]] <- "new val"
, but I am genuinely stumped as to how to generate the recursive [[ selection in a way that will allow me to then assign to it as well.
Why cant you just do
L[[sel]] <- "new val"
well if you want to do the long way then
You could still use Reduce with modifyList or you could use [[<-. Here is an example with modifyList:
modifyList(L,Reduce(function(x,y)setNames(list(x),y),rev(sel),init = "new val"))
$lev1
$lev1$lev2
[1] "new val"
$other
$other$yep
[1] 1
You could eval() and parse() by concatenating everything. I am not sure how generalized you could make it:
``` r
L <- list(lev1 = list(lev2 = c("bit1","bit2")), other=list(yep=1))
L
#> $lev1
#> $lev1$lev2
#> [1] "bit1" "bit2"
#>
#>
#> $other
#> $other$yep
#> [1] 1
sel <- c("lev1","lev2")
eval(parse(text = paste0('L', paste0('[["', sel, '"]]', collapse = ''), '<- "new val"')))
L
#> $lev1
#> $lev1$lev2
#> [1] "new val"
#>
#>
#> $other
#> $other$yep
#> [1] 1
Created on 2019-11-25 by the reprex package (v0.3.0)

purr::pmap does not keep parameter names

I'm trying to execute a function that uses the names of passed parameters with purrr::pmap. Unlike purrr::map (see below), pmap doesn't preserve these names. The below MWE captures the issue:
print_names <- function(x) {
print(names(x))
}
namedVec <- c(nameA = "valueA")
purrr::map(list(namedVec), print_names)
# [1] "nameA"
# [[1]]
# [1] "nameA"
purrr::pmap(list(namedVec), print_names)
# NULL
# $nameA
# NULL
Note that, in pmap, the .l argument needs to be a list of listed arguments, but in your function call it's just a list:
print_names <- function(x) {
print(names(x))
}
namedVec <- c(nameA = "valueA")
purrr::map(list(namedVec), ~print_names(.))
#> [1] "nameA"
#> [[1]]
#> [1] "nameA"
purrr::pmap(list(list(namedVec)), print_names)
#> [1] "nameA"
#> [[1]]
#> [1] "nameA"
Created on 2018-10-07 by the reprex package (v0.2.1)

Resources