Apply regmatches function to a list of chr in R

I have this character vector stored in a variable called x:
x <-
c(
"images/logos/france2.png",
"images/logos/cnews.png",
"images/logos/lcp.png",
"images/logos/europe1.png",
"images/logos/rmc-bfmtv.png",
"images/logos/sudradio.png",
"images/logos/franceinfo.png"
)
pattern <- "images/logos/\\s*(.*?)\\s*.png"
regmatches(x, regexec(pattern, x))[[1]][2]
I wish to extract a portion of each string according to a pattern. The code above does this and works fine, but only for the first item in the vector.
pattern <- "images/logos/\\s*(.*?)\\s*.png"
y <- regmatches(x, regexec(pattern, x))[[1]][2]
Only returns:
"france2"
How can I apply the regmatches function to all items in the list in order to get a result like this?
[1] "france2" "europe1" "sudradio"
[4] "cnews" "rmc-bfmtv" "franceinfo"
[7] "lcp" "rmc" "lcp"
FYI this is a list of src tags that comes from a scraper

Try gsub
gsub(
".*/(.*)\\.png", "\\1",
c(
"images/logos/france2.png", "images/logos/cnews.png",
"images/logos/lcp.png", "images/logos/europe1.png",
"images/logos/rmc-bfmtv.png", "images/logos/sudradio.png",
"images/logos/franceinfo.png"
)
)
which gives
[1] "france2" "cnews" "lcp" "europe1" "rmc-bfmtv"
[6] "sudradio" "franceinfo"
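One caveat with the substitution approach: sub and gsub return the input unchanged when the pattern does not match, so a path that is not a PNG passes through silently. A minimal sketch, assuming a made-up second path (readme.txt is purely illustrative) and an anchored version of the pattern:

```r
# Two inputs: one PNG logo and one hypothetical non-matching path
paths <- c("images/logos/france2.png", "images/logos/readme.txt")

# sub() leaves non-matching strings untouched
raw <- sub(".*/(.*)\\.png$", "\\1", paths)

# Mark the non-matches explicitly instead of keeping the full path
stems <- raw
stems[!grepl("\\.png$", paths)] <- NA_character_
stems
```

Whether NA is the right placeholder depends on what the downstream code expects; dropping the non-matches is an equally reasonable choice.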

Output of regmatches(..., regexec(...)) is a list. You may use sapply to extract the 2nd element from each element of the list.
sapply(regmatches(x, regexec(pattern, x)), `[[`, 2)
#[1] "france2" "europe1" "sudradio" "cnews" "rmc-bfmtv" "franceinfo"
#[7] "lcp" "rmc" "lcp"
You may also combine basename (base R) with file_path_sans_ext from the tools package, which gives the required output directly.
tools::file_path_sans_ext(basename(x))
#[1] "france2" "europe1" "sudradio" "cnews" "rmc-bfmtv" "franceinfo"
#[7] "lcp" "rmc" "lcp"
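If some strings might not match the pattern, regmatches returns character(0) for them and indexing the second element fails. A sketch of one way to guard against that, assuming a slightly stricter pattern than the one in the question and a made-up non-matching entry:

```r
x <- c("images/logos/france2.png",
       "images/logos/rmc-bfmtv.png",
       "no-logo-here")                       # hypothetical non-matching entry
pattern <- "images/logos/([^/]+)\\.png$"     # escaped dot, anchored at the end

hits <- regmatches(x, regexec(pattern, x))

# Element 1 is the full match, element 2 the capture group;
# fall back to NA when there was no match at all
logos <- vapply(
  hits,
  function(hit) if (length(hit) >= 2) hit[2] else NA_character_,
  character(1)
)
logos
```

vapply is used here only so that the result is guaranteed to be a character vector with one entry per input.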

A possible solution:
library(tidyverse)
df <- data.frame(
stringsAsFactors = FALSE,
strings = c("images/logos/france2.png","images/logos/cnews.png",
"images/logos/lcp.png","images/logos/europe1.png",
"images/logos/rmc-bfmtv.png","images/logos/sudradio.png",
"images/logos/franceinfo.png")
)
df %>%
mutate(strings = str_remove(strings, "images/logos/") %>%
str_remove("\\.png"))
#> strings
#> 1 france2
#> 2 cnews
#> 3 lcp
#> 4 europe1
#> 5 rmc-bfmtv
#> 6 sudradio
#> 7 franceinfo
Or even simpler:
library(tidyverse)
df %>%
mutate(strings = str_extract(strings, "(?<=images/logos/)(.*)(?=\\.png)"))
#> strings
#> 1 france2
#> 2 cnews
#> 3 lcp
#> 4 europe1
#> 5 rmc-bfmtv
#> 6 sudradio
#> 7 franceinfo

Related

How to replace characters in a string one at a time generating new string for each replacement?

I have a vector of strings
c("YSAHEEHHYDK", "HEHISSDYAGK", "TFAHTESHISK", "ISLGEHEGGGK",
"LSSGYDGTSYK", "FGTGTYAGGEK", "VGASTGYSGLK", "TASGVGGFSTK", "SYASDFGSSAK",
"LYSYYSSTESK")
For each string I would like to replace "Y", "S" or "T" with "pY", "pS" or "pT". But I don't want all the replacements to be in the same final string; I want each replacement to generate a new string, e.g.
"YSAHEEHHYDK" turns into
c("pYSAHEEHHYDK",
"YpSAHEEHHYDK",
"YSAHEEHHpYDK")
Using xx input in the Note at the end (which is as in the question plus some border tests) we use stringi functions. In particular note that stri_sub can insert a p character. If an input string is empty, i.e. "", or does not contain any of Y, S or T then NA is returned for that string.
library(stringi)
add_p <- function(s, loc) {
start <- loc[, "start"]
stri_sub(s, start, start-1) <- "p"
s
}
Map(add_p, xx, stri_locate_all(xx, regex = "[YST]"))
giving
[1] NA
$ABC
[1] NA
$YSAHEEHHYDK
[1] "pYSAHEEHHYDK" "YpSAHEEHHYDK" "YSAHEEHHpYDK"
$HEHISSDYAGK
[1] "HEHIpSSDYAGK" "HEHISpSDYAGK" "HEHISSDpYAGK"
$TFAHTESHISK
[1] "pTFAHTESHISK" "TFAHpTESHISK" "TFAHTEpSHISK" "TFAHTESHIpSK"
# ...snip...
Note
This is the same as in the question except that we have added the first two strings.
xx <- c("", "ABC", "YSAHEEHHYDK", "HEHISSDYAGK", "TFAHTESHISK", "ISLGEHEGGGK",
"LSSGYDGTSYK", "FGTGTYAGGEK", "VGASTGYSGLK", "TASGVGGFSTK", "SYASDFGSSAK",
"LYSYYSSTESK")
You could write a function in base R:
Edit:
Included the notion of zero-length matches as shown by @GKi
strings <- c("YSAHEEHHYDK", "HEHISSDYAGK", "TFAHTESHISK", "ISLGEHEGGGK",
"LSSGYDGTSYK", "FGTGTYAGGEK", "VGASTGYSGLK", "TASGVGGFSTK",
"SYASDFGSSAK", "LYSYYSSTESK")
reg <- gregexpr("[YST]", strings)
`regmatches<-`(rep(strings, lengths(reg)),
`attr<-`(unlist(reg), "match.length", 0), value = 'p')
#> [1] "pYSAHEEHHYDK" "YpSAHEEHHYDK" "YSAHEEHHpYDK" "HEHIpSSDYAGK" "HEHISpSDYAGK"
#> [6] "HEHISSDpYAGK" "pTFAHTESHISK" "TFAHpTESHISK" "TFAHTEpSHISK" "TFAHTESHIpSK"
#> [11] "IpSLGEHEGGGK" "LpSSGYDGTSYK" "LSpSGYDGTSYK" "LSSGpYDGTSYK" "LSSGYDGpTSYK"
#> [16] "LSSGYDGTpSYK" "LSSGYDGTSpYK" "FGpTGTYAGGEK" "FGTGpTYAGGEK" "FGTGTpYAGGEK"
#> [21] "VGApSTGYSGLK" "VGASpTGYSGLK" "VGASTGpYSGLK" "VGASTGYpSGLK" "pTASGVGGFSTK"
#> [26] "TApSGVGGFSTK" "TASGVGGFpSTK" "TASGVGGFSpTK" "pSYASDFGSSAK" "SpYASDFGSSAK"
#> [31] "SYApSDFGSSAK" "SYASDFGpSSAK" "SYASDFGSpSAK" "LpYSYYSSTESK" "LYpSYYSSTESK"
#> [36] "LYSpYYSSTESK" "LYSYpYSSTESK" "LYSYYpSSTESK" "LYSYYSpSTESK" "LYSYYSSpTESK"
#> [41] "LYSYYSSTEpSK"
Created on 2023-02-14 with reprex v2.0.2
You can create a small function to help you out.
my_replace <- function(x){
reg <- gregexpr("[YST]", x)
`regmatches<-`(rep(x, lengths(reg)), structure(unlist(reg), match.length = 0), value = "p")
}
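For readers who find the `regmatches<-` trick hard to follow, the same idea can be sketched with plain substring and paste0. This is a swapped-in variant, not the code from the answers above; the helper name insert_before and its arguments are made up for illustration:

```r
# Insert `tag` before every match of `chars`, one new string per match
insert_before <- function(s, chars = "[YST]", tag = "p") {
  pos <- as.integer(gregexpr(chars, s)[[1]])
  if (pos[1] == -1L) return(character(0))   # no match: nothing to generate
  paste0(substring(s, 1, pos - 1), tag, substring(s, pos, nchar(s)))
}

strings <- c("YSAHEEHHYDK", "ABC")
res <- lapply(setNames(strings, strings), insert_before)
res
```

Returning character(0) for strings without a match is a design choice here; the answers above return NA or the unchanged string instead.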
Perhaps something like this with stringr and purrr.
str_locate_all() returns a two-column matrix with the start and end of each match, and str_sub(string, start) <- "p" conveniently accepts that same matrix as its start argument. Subtracting 1 from the end column (i.e. [1, 1] becomes [1, 0]) makes each range zero-width, so all existing characters are kept and p is inserted.
library(stringr)
library(purrr)
str_ <- c("YSAHEEHHYDK", "HEHISSDYAGK", "TFAHTESHISK", "ISLGEHEGGGK",
"LSSGYDGTSYK", "FGTGTYAGGEK", "VGASTGYSGLK", "TASGVGGFSTK",
"SYASDFGSSAK", "LYSYYSSTESK")
map2(set_names(str_),
str_locate_all(str_,"Y|S|T"),
function(x, y) {
y[,2] <- y[,2] - 1
str_sub(x, y) <- "p"
x
})
Result as a named list:
#> $YSAHEEHHYDK
#> [1] "pYSAHEEHHYDK" "YpSAHEEHHYDK" "YSAHEEHHpYDK"
#>
#> $HEHISSDYAGK
#> [1] "HEHIpSSDYAGK" "HEHISpSDYAGK" "HEHISSDpYAGK"
#>
#> $TFAHTESHISK
#> [1] "pTFAHTESHISK" "TFAHpTESHISK" "TFAHTEpSHISK" "TFAHTESHIpSK"
#>
#> $ISLGEHEGGGK
#> [1] "IpSLGEHEGGGK"
#>
#> $LSSGYDGTSYK
#> [1] "LpSSGYDGTSYK" "LSpSGYDGTSYK" "LSSGpYDGTSYK" "LSSGYDGpTSYK" "LSSGYDGTpSYK"
#> [6] "LSSGYDGTSpYK"
#>
#> $FGTGTYAGGEK
#> [1] "FGpTGTYAGGEK" "FGTGpTYAGGEK" "FGTGTpYAGGEK"
#>
#> $VGASTGYSGLK
#> [1] "VGApSTGYSGLK" "VGASpTGYSGLK" "VGASTGpYSGLK" "VGASTGYpSGLK"
#>
#> $TASGVGGFSTK
#> [1] "pTASGVGGFSTK" "TApSGVGGFSTK" "TASGVGGFpSTK" "TASGVGGFSpTK"
#>
#> $SYASDFGSSAK
#> [1] "pSYASDFGSSAK" "SpYASDFGSSAK" "SYApSDFGSSAK" "SYASDFGpSSAK" "SYASDFGSpSAK"
#>
#> $LYSYYSSTESK
#> [1] "LpYSYYSSTESK" "LYpSYYSSTESK" "LYSpYYSSTESK" "LYSYpYSSTESK" "LYSYYpSSTESK"
#> [6] "LYSYYSpSTESK" "LYSYYSSpTESK" "LYSYYSSTEpSK"
Created on 2023-02-15 with reprex v2.0.2
A base variant similar to the methods from @G.Grothendieck and @margusl, using gregexpr to find the positions of Y, S or T and using regmatches<-, like @onyambu, to insert p at these positions.
sIn <- function(s, i) {
`regmatches<-`(rep(s, length(i)), `attr<-`(i, "match.length", 0), value="p")
}
Map(sIn, s, gregexpr("[YST]", s))
#[[1]]
#[1] ""
#
#$ABC
#[1] "ABC"
#
#$YSAHEEHHYDK
#[1] "pYSAHEEHHYDK" "YpSAHEEHHYDK" "YSAHEEHHpYDK"
#
#$HEHISSDYAGK
#[1] "HEHIpSSDYAGK" "HEHISpSDYAGK" "HEHISSDpYAGK"
#...
Or using str_sub<- and str_locate_all from stringr with a non-consuming lookahead (?=[YST]).
library(stringr)
Map(`str_sub<-`, s, str_locate_all(s,"(?=[YST])"), value="p")
#[[1]]
#character(0)
#
#$ABC
#character(0)
#
#$YSAHEEHHYDK
#[1] "pYSAHEEHHYDK" "YpSAHEEHHYDK" "YSAHEEHHpYDK"
#
#$HEHISSDYAGK
#[1] "HEHIpSSDYAGK" "HEHISpSDYAGK" "HEHISSDpYAGK"
#...
Or the same but using stringi.
library(stringi)
Map(`stri_sub<-`, s, stri_locate_all(s, regex="(?=[YST])"), value="p")
#[[1]]
#[1] NA
#
#$ABC
#[1] NA
#
#$YSAHEEHHYDK
#[1] "pYSAHEEHHYDK" "YpSAHEEHHYDK" "YSAHEEHHpYDK"
#
#$HEHISSDYAGK
#[1] "HEHIpSSDYAGK" "HEHISpSDYAGK" "HEHISSDpYAGK"
#...
Data (added the first two strings like @G.Grothendieck)
s <- c("", "ABC", "YSAHEEHHYDK", "HEHISSDYAGK", "TFAHTESHISK", "ISLGEHEGGGK",
"LSSGYDGTSYK", "FGTGTYAGGEK", "VGASTGYSGLK", "TASGVGGFSTK",
"SYASDFGSSAK", "LYSYYSSTESK")

Looping through environment objects with a special pattern

I have multiple lists in my environment (all start with "CDS_"). Each list consists of multiple sub-lists. I want to call the lists one by one to apply a function to each of these objects.
This is what I am trying:
lists<-grep("CDS_",names(.GlobalEnv),value=TRUE) # lists all objects starting with "CDS_"
for (i in seq_along(lists)){
data<-do.call("list",mget(lists[i])) #this line blends all sub lists into one list
assign(paste("Df_", lists[i], sep = "_"), my_function(data)) # my_function requires a list with multiple sub lists
}
but the issue is the do.call("list",mget(lists[i])) blends all sub lists into one. For example if there is a list with one sub list it returns the list but all sub lists go into one!
Any solutions how to make this work?
here is a sample to test:
#Defining my_function pulling out the sub list which contains "sample1"
my_function<-function(.data){
# pull out the undergraduate data
grep("sample1", .data, value = TRUE)
}
# 1st list
list_1 <- list(1:54,
c("This","is","sample1","for","list1"),
c("This","is","sample2","for","list1"),
"Hi")
# 2nd list
list_2 <- list(51:120,
c("This","is","sample1","for","list1"),
c("This","is","sample2","for","list1"),
"Bus")
# 3rd list
list_3 <- list(90:120,
letters[16:11],
2025)
lists<-grep("list_",names(.GlobalEnv),value=TRUE)
for (i in seq_along(lists)){
data<-do.call("list",mget(lists[i]))
assign(paste("sample1_", lists[i], sep = ""), my_function(data))
}
As mentioned by @MrFlick, R has a ton of list functionality. It is usually the case that you are better off storing your lists in a list than trying to directly edit them in the environment. Here is one possible solution using base R:
l <- mget(ls(pattern = "^list_\\d$")) # store lists in a list
lapply(l, \(x) lapply(x, my_function))
$list_1
$list_1[[1]]
character(0)
$list_1[[2]]
[1] "sample1"
$list_1[[3]]
character(0)
$list_1[[4]]
character(0)
$list_2
$list_2[[1]]
character(0)
$list_2[[2]]
[1] "sample1"
$list_2[[3]]
character(0)
$list_2[[4]]
character(0)
$list_3
$list_3[[1]]
character(0)
$list_3[[2]]
character(0)
$list_3[[3]]
character(0)
Update
Sticking with base R to remove non-matches you could do:
lapply(l, \(x) Filter(length, lapply(x, my_function)))
$list_1
$list_1[[1]]
[1] "sample1"
$list_2
$list_2[[1]]
[1] "sample1"
$list_3
list()
A purrr solution would be:
library(purrr)
map(map_depth(l, 2, my_function), compact)
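The wrapping problem from the question comes from mget, which always returns a named list of the requested objects, so do.call("list", mget(name)) yields a list of length one that contains the original list. A self-contained sketch of the "keep everything in a list" approach, using a scratch environment and made-up objects (CDS_a, CDS_b, unrelated and find_sample1 are assumptions for illustration):

```r
# Scratch environment standing in for the global environment
env <- new.env()
assign("CDS_a", list(1:3, c("x", "sample1")), envir = env)
assign("CDS_b", list("other"), envir = env)
assign("unrelated", 42, envir = env)

# get() returns the object itself; mget() wraps it in a named list
one <- get("CDS_a", envir = env)
wrapped <- mget("CDS_a", envir = env)

# Collect all CDS_ objects once, then work on the list of lists
cds <- mget(ls(env, pattern = "^CDS_"), envir = env)
find_sample1 <- function(sub) grep("sample1", sub, value = TRUE)
results <- lapply(cds, function(lst) Filter(length, lapply(lst, find_sample1)))

# Rename instead of calling assign() once per object
names(results) <- paste0("Df_", names(results))
results
```

Keeping the results in one named list avoids creating a new variable per input, and results[["Df_CDS_a"]] retrieves an individual entry when needed.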
When you have lists of lists, an option is rapply, the recursive version of lapply.
my_function<-function(.data){
# pull out the undergraduate data
grep("sample1", .data, value = TRUE)
}
lists <- mget(ls(pattern = "^list_"))
rapply(lists, my_function, how = "list")
#> $list_1
#> $list_1[[1]]
#> character(0)
#>
#> $list_1[[2]]
#> [1] "sample1"
#>
#> $list_1[[3]]
#> character(0)
#>
#> $list_1[[4]]
#> character(0)
#>
#>
#> $list_2
#> $list_2[[1]]
#> character(0)
#>
#> $list_2[[2]]
#> [1] "sample1"
#>
#> $list_2[[3]]
#> character(0)
#>
#> $list_2[[4]]
#> character(0)
#>
#>
#> $list_3
#> $list_3[[1]]
#> character(0)
#>
#> $list_3[[2]]
#> character(0)
#>
#> $list_3[[3]]
#> character(0)
Created on 2022-05-13 by the reprex package (v2.0.1)
Edit
To answer the OP's comment on another answer: to keep only the matches, save the rapply result, then loop over it with lapply and use lengths (the list version of length) to keep only the non-empty elements.
r <- rapply(lists, my_function, how = "list")
lapply(r, \(x) x[lengths(x) > 0])
#> $list_1
#> $list_1[[1]]
#> [1] "sample1"
#>
#>
#> $list_2
#> $list_2[[1]]
#> [1] "sample1"
#>
#>
#> $list_3
#> list()
Created on 2022-05-13 by the reprex package (v2.0.1)
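If a flat vector of matches is enough, rapply can also restrict the function to character leaves and unlist the result in one step. A small sketch with made-up data (the list nested is an assumption, not the data from the question):

```r
nested <- list(
  a = list(1:3, c("This", "sample1")),
  b = list("sample2", 2025)
)

# classes = "character" skips the numeric leaves;
# how = "unlist" flattens whatever the function returns
hits <- rapply(
  nested,
  function(v) grep("sample1", v, value = TRUE),
  classes = "character",
  how = "unlist"
)
unname(hits)
```

The flattened result loses the information about which sub-list a match came from, so this only suits cases where that structure is not needed.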

Is there a specific function in R to merge 2 vectors [duplicate]

I have two vectors, one that contains a list of variables, and one that contains dates, such as
Variables_Pays <- c("PIB", "ConsommationPrivee","ConsommationPubliques",
"FBCF","ProductionIndustrielle","Inflation","InflationSousJacente",
"PrixProductionIndustrielle","CoutHoraireTravail")
Annee_Pays <- c("2000","2001")
I want to merge them to have a vector with each variable indexed by my date, that is my desired output is
> Colonnes_Pays_Principaux
[1] "PIB_2000" "PIB_2001" "ConsommationPrivee_2000"
[4] "ConsommationPrivee_2001" "ConsommationPubliques_2000" "ConsommationPubliques_2001"
[7] "FBCF_2000" "FBCF_2001" "ProductionIndustrielle_2000"
[10] "ProductionIndustrielle_2001" "Inflation_2000" "Inflation_2001"
[13] "InflationSousJacente_2000" "InflationSousJacente_2001" "PrixProductionIndustrielle_2000"
[16] "PrixProductionIndustrielle_2001" "CoutHoraireTravail_2000" "CoutHoraireTravail_2001"
Is there a simpler / more readable way than the double for loop below, which I have tried and which works?
Colonnes_Pays_Principaux <- vector()
for (Variable in (1:length(Variables_Pays))){
for (Annee in (1:length(Annee_Pays))){
Colonnes_Pays_Principaux=
append(Colonnes_Pays_Principaux,
paste(Variables_Pays[Variable],Annee_Pays[Annee],sep="_")
)
}
}
expand.grid will create a data frame with all combinations of the two vectors.
with(
expand.grid(Variables_Pays, Annee_Pays),
paste0(Var1, "_", Var2)
)
#> [1] "PIB_2000" "ConsommationPrivee_2000"
#> [3] "ConsommationPubliques_2000" "FBCF_2000"
#> [5] "ProductionIndustrielle_2000" "Inflation_2000"
#> [7] "InflationSousJacente_2000" "PrixProductionIndustrielle_2000"
#> [9] "CoutHoraireTravail_2000" "PIB_2001"
#> [11] "ConsommationPrivee_2001" "ConsommationPubliques_2001"
#> [13] "FBCF_2001" "ProductionIndustrielle_2001"
#> [15] "Inflation_2001" "InflationSousJacente_2001"
#> [17] "PrixProductionIndustrielle_2001" "CoutHoraireTravail_2001"
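expand.grid varies its first argument fastest, which is why the years end up grouped together above. If the variable-major order from the question is wanted, one option is to pass the years first. A short sketch with a shortened, illustrative pair of vectors (vars and years are stand-ins for Variables_Pays and Annee_Pays):

```r
vars  <- c("PIB", "FBCF")
years <- c("2000", "2001")

# First column cycles fastest, so put the years first
g <- expand.grid(year = years, var = vars, stringsAsFactors = FALSE)
cols <- paste(g$var, g$year, sep = "_")
cols
```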
We can use outer :
c(t(outer(Variables_Pays, Annee_Pays, paste, sep = '_')))
# [1] "PIB_2000" "PIB_2001"
# [3] "ConsommationPrivee_2000" "ConsommationPrivee_2001"
# [5] "ConsommationPubliques_2000" "ConsommationPubliques_2001"
# [7] "FBCF_2000" "FBCF_2001"
# [9] "ProductionIndustrielle_2000" "ProductionIndustrielle_2001"
#[11] "Inflation_2000" "Inflation_2001"
#[13] "InflationSousJacente_2000" "InflationSousJacente_2001"
#[15] "PrixProductionIndustrielle_2000" "PrixProductionIndustrielle_2001"
#[17] "CoutHoraireTravail_2000" "CoutHoraireTravail_2001"
No real need to go beyond the basics here! Use paste for pasting the strings and rep to repeat either Annee_Pays or Variables_Pays to get all combinations:
Variables_Pays <- c("PIB", "ConsommationPrivee","ConsommationPubliques",
"FBCF","ProductionIndustrielle","Inflation","InflationSousJacente",
"PrixProductionIndustrielle","CoutHoraireTravail")
Annee_Pays <- c("2000","2001")
# To get this in the same order as in your example:
paste(rep(Variables_Pays, rep(2, length(Variables_Pays))), Annee_Pays, sep = "_")
# Alternative order:
paste(Variables_Pays, rep(Annee_Pays, rep(length(Variables_Pays), 2)), sep = "_")
# Or, if order doesn't matter too much:
paste(Variables_Pays, rep(Annee_Pays, length(Variables_Pays)), sep = "_")
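The first of these calls can be written a little more directly with the each argument of rep. A minimal sketch on shortened stand-in vectors (vars and years are illustrative), which also checks that it agrees with the outer approach:

```r
vars  <- c("PIB", "FBCF")
years <- c("2000", "2001")

# Repeat each variable once per year; the years are recycled by paste()
by_rep <- paste(rep(vars, each = length(years)), years, sep = "_")

# Same combinations via outer(), read row by row
by_outer <- c(t(outer(vars, years, paste, sep = "_")))

by_rep
```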
In base R:
Variables_Pays <- c("PIB", "ConsommationPrivee","ConsommationPubliques",
"FBCF","ProductionIndustrielle","Inflation","InflationSousJacente",
"PrixProductionIndustrielle","CoutHoraireTravail")
Annee_Pays <- c("2000","2001")
cbind(paste(Variables_Pays, Annee_Pays,sep="_"),paste(Variables_Pays, rev(Annee_Pays),sep="_"))

Replacing nested list using a vector of names of depths as an index

Take a simple nested list L:
L <- list(lev1 = list(lev2 = c("bit1","bit2")), other=list(yep=1))
L
#$lev1
#$lev1$lev2
#[1] "bit1" "bit2"
#
#
#$other
#$other$yep
#[1] 1
And a vector giving a series of depths for each part I want to select from L:
sel <- c("lev1","lev2")
The result I want when indexing is:
L[["lev1"]][["lev2"]]
#[1] "bit1" "bit2"
Which I can generalise using Reduce like so:
Reduce(`[[`, sel, init=L)
#[1] "bit1" "bit2"
Now, I want to extend this logic to do a replacement, like so:
L[["lev1"]][["lev2"]] <- "new val"
, but I am genuinely stumped as to how to generate the recursive [[ selection in a way that will allow me to then assign to it as well.
Why can't you just do
L[[sel]] <- "new val"
If you want to do it the long way, then:
You could still use Reduce with modifyList or you could use [[<-. Here is an example with modifyList:
modifyList(L,Reduce(function(x,y)setNames(list(x),y),rev(sel),init = "new val"))
$lev1
$lev1$lev2
[1] "new val"
$other
$other$yep
[1] 1
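Another way to think about the replacement is as a small recursive helper that walks the path one name at a time and rebuilds the list on the way back. This is only a sketch under the assumption that every name in the path refers to a list level; the function name set_in is made up for illustration:

```r
# Replace the element reached by following `path` through nested lists
set_in <- function(lst, path, value) {
  if (length(path) == 1L) {
    lst[[path]] <- value
    return(lst)
  }
  # Recurse into the next level, then store the modified sub-list back
  lst[[path[1]]] <- set_in(lst[[path[1]]], path[-1], value)
  lst
}

L <- list(lev1 = list(lev2 = c("bit1", "bit2")), other = list(yep = 1))
sel <- c("lev1", "lev2")
L_new <- set_in(L, sel, "new val")
L_new
```

Unlike an in-place assignment, the helper returns a modified copy, so the original L is left untouched unless the result is assigned back to it.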
You could eval() and parse() by concatenating everything. I am not sure how generalized you could make it:
``` r
L <- list(lev1 = list(lev2 = c("bit1","bit2")), other=list(yep=1))
L
#> $lev1
#> $lev1$lev2
#> [1] "bit1" "bit2"
#>
#>
#> $other
#> $other$yep
#> [1] 1
sel <- c("lev1","lev2")
eval(parse(text = paste0('L', paste0('[["', sel, '"]]', collapse = ''), '<- "new val"')))
L
#> $lev1
#> $lev1$lev2
#> [1] "new val"
#>
#>
#> $other
#> $other$yep
#> [1] 1
```
Created on 2019-11-25 by the reprex package (v0.3.0)

R For loop unwanted overwrite

I would like to store the result of each loop iteration separately, under its own name.
Right now each iteration overwrites the previous result:
library(rvest)
main.page <- read_html(x = "http://www.imdb.com/event/ev0000681/2016")
urls <- main.page %>% # feed `main.page` to the next step
html_nodes(".alt:nth-child(2) strong a") %>% # get the CSS nodes
html_attr("href") # extract the URLs
for (i in urls){
a01 <- paste0("http://www.imdb.com",i)
text <- read_html(a01) %>% # load the page
html_nodes(".credit_summary_item~ .credit_summary_item+ .credit_summary_item .itemprop , .summary_text+ .credit_summary_item .itemprop") %>% # isolate the text
html_text()
}
How could I code it in such a way that the 'i' from the list is added to text in the for statement?
To solidify my comment:
main.page <- read_html(x = "http://www.imdb.com/event/ev0000681/2016")
urls <- main.page %>% # feed `main.page` to the next step
html_nodes(".alt:nth-child(2) strong a") %>% # get the CSS nodes
html_attr("href") # extract the URLs
texts <- sapply(head(urls, n = 3), function(i) {
read_html(paste0("http://www.imdb.com", i)) %>%
html_nodes(".credit_summary_item~ .credit_summary_item+ .credit_summary_item .itemprop , .summary_text+ .credit_summary_item .itemprop") %>%
html_text()
}, simplify = FALSE)
str(texts)
# List of 3
# $ /title/tt5843990/: chr [1:4] "Lav Diaz" "Charo Santos-Concio" "John Lloyd Cruz" "Michael De Mesa"
# $ /title/tt4551318/: chr [1:4] "Andrey Konchalovskiy" "Yuliya Vysotskaya" "Peter Kurth" "Philippe Duquesne"
# $ /title/tt4550098/: chr [1:4] "Tom Ford" "Amy Adams" "Jake Gyllenhaal" "Michael Shannon"
If you use lapply(...), you'll get an unnamed list, which may or may not be a problem for you. Instead, using sapply(..., simplify = FALSE), we get a named list where each name is (in this case) the partial url retrieved from urls.
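If a for loop is preferred, the same named-list idea works there too: allocate the list once and assign into it by name, rather than overwriting one variable. A sketch that runs without network access, where the paths and the fake_fetch function are hypothetical stand-ins for the scraped URLs and the read_html pipeline:

```r
# Hypothetical partial URLs, standing in for the scraped `urls`
urls <- c("/title/tt0000001/", "/title/tt0000002/")

# Stand-in for the scraping step; here it only builds the full URL
fake_fetch <- function(path) paste0("http://www.imdb.com", path)

# Pre-allocate a named list and fill one slot per iteration
texts <- setNames(vector("list", length(urls)), urls)
for (u in urls) {
  texts[[u]] <- fake_fetch(u)
}
str(texts)
```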
Using sapply without simplify = FALSE can lead to unexpected outputs. As an example:
set.seed(9)
sapply(1:3, function(i) rep(i, sample(3, size=1)))
# [1] 1 2 3
One may think that this will always return a vector. However, if the returned elements do not all have the same length, the result is a list instead:
set.seed(10)
sapply(1:3, function(i) rep(i, sample(3, size=1)))
# [[1]]
# [1] 1 1
# [[2]]
# [1] 2
# [[3]]
# [1] 3 3
In which case, it's best to have certainty in the return value, forcing a list:
set.seed(9)
sapply(1:3, function(i) rep(i, sample(3, size=1)), simplify = FALSE)
# [[1]]
# [1] 1
# [[2]]
# [1] 2
# [[3]]
# [1] 3
That way, you always know exactly how to reference sub-returns. (This is one of the tenets and advantages of Hadley's purrr package: each function always returns an object of exactly the type you declare. There are other advantages to the package.)
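Base R offers a similar guarantee through vapply, which takes a template for each result and stops with an error when a result does not fit, instead of silently changing the return type. A brief sketch with made-up inputs:

```r
# Every result is a single string, so this returns a character vector
ids <- vapply(1:3, function(i) paste0("id", i), character(1))

# Results of differing length violate the template, so vapply() errors;
# the error is captured here only to keep the example running
ragged <- tryCatch(
  vapply(1:3, function(i) rep("x", i), character(1)),
  error = function(e) e
)

ids
inherits(ragged, "error")
```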

Resources