How do I replicate a nested for loop with mapply? - r

I would like to vectorize the creation of a list in R, but can only get what I want with a nested for loop. I've included a vastly simplified version of my problem for reproducibility. Can someone help me to modify or replace my mapply function?
Desired functionality:
my_list <- list()
A <- c("one", "two", "three", "four")
B <- c("left", "right")
for (a in A) {
for (b in B) {
my_list <- c(my_list, paste(a, b))
}
}
print(my_list)
output (edited white space for brevity):
[[1]] [1] "one left"
[[2]] [1] "one right"
[[3]] [1] "two left"
[[4]] [1] "two right"
[[5]] [1] "three left"
[[6]] [1] "three right"
[[7]] [1] "four left"
[[8]] [1] "four right"
My attempt to vectorize this:
combinate <- function(a, b) {
return(paste(a, b))
}
mapply(combinate, a=A, b=B, SIMPLIFY=FALSE)
output:
$one [1] "one left"
$two [1] "two right"
$three [1] "three left"
$four [1] "four right"
I'm not concerned about labels; I'm concerned about getting all eight results from looping over both lists. I have found documentation that mapply is doing exactly what it is supposed to by pairing the first items from both lists, then the second items from both lists, etc. repeating shorter lists. But after much searching, I can't find what must be there, a way to pair all list items combinatorically like the nested for loop.

We can do this with expand.grid and paste:
v1 <- do.call(paste, expand.grid(A, B))
Or with outer:
v1 <- c(outer(A, B, paste))
If these need to be in a list:
as.list(v1)
Checking against the OP's output (t() transposes the matrix so the order matches the nested loop):
identical(as.list( c(t(outer(A, B, paste)))), my_list)
#[1] TRUE
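To stay closer to the original mapply call, one possible sketch is to expand both vectors with rep() first, so that mapply sees every pair. A, B and combinate come from the question; res_mapply, combos and res_grid are illustrative names only. Note that expand.grid varies its first column fastest, which is why B is listed first below to reproduce the loop order.

```r
A <- c("one", "two", "three", "four")
B <- c("left", "right")
combinate <- function(a, b) paste(a, b)

# Reference result from the nested loop in the question
my_list <- list()
for (a in A) {
  for (b in B) {
    my_list <- c(my_list, paste(a, b))
  }
}

# Repeat each element of A once per element of B, and cycle B once per element of A,
# so that mapply's element-wise pairing covers every combination
res_mapply <- mapply(combinate,
                     rep(A, each = length(B)),
                     rep(B, times = length(A)),
                     SIMPLIFY = FALSE, USE.NAMES = FALSE)

# expand.grid varies its first column fastest, so B goes first to match the loop order
combos <- expand.grid(b = B, a = A, stringsAsFactors = FALSE)
res_grid <- as.list(paste(combos$a, combos$b))

identical(res_mapply, my_list)
identical(res_grid, my_list)
```

Both comparisons should come out TRUE; the rep() version keeps the per-pair function call, which may matter if combinate does more than paste.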

Related

How to access all sub list elements in R at once?

I have split the strings of a vector like this:
library(stringr)
df <- c("Test A:No1", "Test B:No2")
l <- str_split(df, ":")
l
which returns
[[1]]
[1] "Test A" "No1"
[[2]]
[1] "Test B" "No2"
Now I am interested in accessing all first elements and all last elements independently or create a vector like
[1] "Test A" "Test B"
and
[1] "No1" "No2"
I tried several types of single and double brackets, with and without commas, but l[[x]][1] or l[[x]][2] give me only the list element x.
How can I access all elements at once (e.g. l[[]][1] )?
You may use sapply.
sapply(l, `[`, 1)
# [1] "Test A" "Test B"
sapply(l, `[`, 2)
# [1] "No1" "No2"
Explanation: In R, almost everything is a function call. Even the bracket operator `[` is actually a function. The following example makes clear why the sapply calls above work.
Example
Consider this vector
x <- c("A", "B")
When we do
x[1]
# [1] "A"
x[2]
# [1] "B"
we're actually using special syntax for the underlying prefix form of the `[` function:
`[`(x, 1)
# [1] "A"
`[`(x, 2)
# [1] "B"
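As a hedged variation on the same idea, vapply can be used in place of sapply when the result should always be a character vector. This sketch uses base strsplit instead of str_split to stay dependency-free; first and last are illustrative names.

```r
# Base-R equivalent of the split in the question
l <- strsplit(c("Test A:No1", "Test B:No2"), ":", fixed = TRUE)

# `[` is passed as the function; the trailing 1 becomes its index argument.
# character(1) declares that every call must return exactly one string.
first <- vapply(l, `[`, character(1), 1)

# An anonymous function works too, e.g. to take the last piece of each element
last <- vapply(l, function(p) p[length(p)], character(1))

first   # "Test A" "Test B"
last    # "No1" "No2"
```

Unlike sapply, vapply stops with an error if any element yields something other than a single string, which can make surprises easier to spot.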
Maybe unlist together with lapply can get the job done:
library(stringr)
df <- c("Test A:No1", "Test B:No2")
l <- str_split(df, ":")
> unlist(lapply(l,function(x) x[1]))
[1] "Test A" "Test B"
> unlist(lapply(l,function(x) x[length(x)]))
[1] "No1" "No2"

How to perform a vectorised operation with a self-defined function, adding the results to a list?

library(tidyverse)
ridiculous_function <- function(a, b){
moo <- a
baz <- b
list(moo, baz)
}
test <- ridiculous_function("apple", "A")
> test
[[1]]
[1] "apple"
[[2]]
[1] "A"
This code produces a list of elements of a and b, however what I would like is to run the function over two vectors in parallel, and then put all of the results in the same list.
For example, with these two vectors:
fruits10 <- fruit[1:10]
letters10 <- LETTERS[1:10]
I would want to create a list which produces elements of character vectors for "apple", "A", "apricot", "B", "avocado", "C".. and so on. My real scenario is a lot more complex so I need a solution which works with the confines of my function.
Expected output:
> test
[[1]]
[1] "apple"
[[2]]
[1] "A"
[[3]]
[1] "apricot"
[[4]]
[1] "B"
[[5]]
[1] "avocado"
[[6]]
[1] "C"
....
[[19]]
[1] "blueberry"
[[20]]
[1] "J"
How about:
fruits10 <- fruit[1:10]
letters10 <- LETTERS[1:10]
ridiculous_function <- function(a, b){
moo <- a
baz <- b
list(moo, baz)
}
library(tidyverse)
flatten(map2(fruits10, letters10, ridiculous_function))
which gives you
[[1]]
[1] "apple"
[[2]]
[1] "A"
[[3]]
[1] "apricot"
[[4]]
[1] "B"
[[5]]
[1] "avocado"
[[6]]
[1] "C"
[[7]]
[1] "banana"
[[8]]
[1] "D"
etc...
Here are a few different ways of doing this:
library(tidyverse)
fruits10 <- fruit[1:10]
letters10 <- LETTERS[1:10]
ridiculous_function <- function(a, b){
moo <- a
baz <- b
list(moo, baz)
}
# using mapply, base R for writing packages
mapply(ridiculous_function, fruits10, letters10) %>%
split(rep(1:ncol(.), each = nrow(.)))
# using map2, takes two args
map2(fruits10, letters10, ridiculous_function)
# using pmap, can take as many args as you want
list(a = fruits10,
b = letters10) %>%
pmap(ridiculous_function)
You ask for results in a flat list format, so you can pop a flatten at the end
of each of these, but usually you would want to retain the list structure.
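For a version without tidyverse, one possible base-R sketch is to collect the per-pair lists with mapply and then concatenate them with do.call(c, ...). The vectors fruits3 and letters3 are hypothetical stand-ins for fruit[1:3] and LETTERS[1:3], so the example runs without stringr; nested and flat are illustrative names.

```r
ridiculous_function <- function(a, b) {
  moo <- a
  baz <- b
  list(moo, baz)
}

fruits3 <- c("apple", "apricot", "avocado")  # stand-in for fruit[1:3]
letters3 <- LETTERS[1:3]

# One two-element list per (fruit, letter) pair, kept as a list of lists
nested <- mapply(ridiculous_function, fruits3, letters3,
                 SIMPLIFY = FALSE, USE.NAMES = FALSE)

# Concatenate the inner lists to remove exactly one level of nesting
flat <- do.call(c, nested)

length(nested)  # 3
length(flat)    # 6
```

Keeping nested around alongside flat preserves the pairing information, which is lost once the list is flattened.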

R loop over two or more vectors simultaneously - parallel

I was looking for a way to iterate over two or more character vectors/lists in R simultaneously, e.g. is there some way to do something like this:
foo <- c('a','c','d')
bar <- c('aa','cc','dd')
for(i in foo){
print(foo[i], bar[i])
}
Desired result:
'a', 'aa'
'c', 'cc'
'd', 'dd'
In Python we can do simply:
foo = ('a', 'c', 'd')
bar = ('aa', 'cc', 'dd')
for i, j in zip(foo, bar):
print(i, j)
But can we do this in R?
Like this?
foo <- c('a','c','d')
bar <- c('aa','cc','dd')
for (i in seq_along(foo)){
print(c(foo[i],bar[i]))
}
[1] "a" "aa"
[1] "c" "cc"
[1] "d" "dd"
Works under the condition that the vectors are the same length.
In R, you usually iterate over indices rather than over the vectors directly:
for (i in seq_len(min(length(foo), length(bar)))) {
print(c(foo[i], bar[i]))
}
Another option is to use mapply. This wouldn't make a lot of sense for printing, but I'm assuming you want to do something more interesting than print:
foo <- c('a','c','d')
bar <- c('aa','cc','dd')
invisible(
mapply(function(f, b){ print(c(f, b))},
foo, bar)
)
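For readers coming from Python, a rough sketch of a zip-like pattern is to build the pairs with Map and then loop over them. This is only one way to approximate zip; pairs and p are illustrative names.

```r
foo <- c("a", "c", "d")
bar <- c("aa", "cc", "dd")

# Map() walks both vectors in step and returns one element per position;
# unname() drops the names that Map derives from the first character vector
pairs <- unname(Map(function(f, b) c(f, b), foo, bar))

# Each p is a length-2 character vector: one item from foo, one from bar
for (p in pairs) {
  cat(p[1], p[2], "\n")
}
```

Unlike Python's zip, Map recycles shorter arguments instead of stopping at the shortest, so this assumes both vectors have the same length.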
Maybe someone arriving based on the title makes good use of this:
foo<-LETTERS[1:10]
bar<-LETTERS[1:3]
i = 0
for (j in 1:length(foo)){
i = i + 1
if (i > length(bar)){
i = 1
}
print(paste(foo[j],bar[i]) )
}
[1] "A A"
[1] "B B"
[1] "C C"
[1] "D A"
[1] "E B"
[1] "F C"
[1] "G A"
[1] "H B"
[1] "I C"
[1] "J A"
which is "equivalent" to the following (though a for loop makes assignments easier):
suppressWarnings(invisible(
mapply(function(x, y){
print(paste(x, y))},
foo, bar)
))
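If the recycling should be explicit rather than left to mapply (and its warning), one option is to stretch the shorter vector with rep_len first. This is a small sketch of that idea; recycled and res are illustrative names.

```r
foo <- LETTERS[1:10]
bar <- LETTERS[1:3]

# Repeat bar as often as needed and cut it to the length of foo
recycled <- rep_len(bar, length(foo))

# paste is vectorised, so no loop is needed once the lengths match
res <- paste(foo, recycled)
res
# [1] "A A" "B B" "C C" "D A" "E B" "F C" "G A" "H B" "I C" "J A"
```

Making the lengths match up front avoids the "longer argument not a multiple of length of shorter" warning that the suppressWarnings call above is hiding.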

Accessing element of a split string in R

If I have a string,
x <- "Hello World"
How can I access the second word, "World", using string split, after
x <- strsplit(x, " ")
x[[2]] does not return it.
As mentioned in the comments, it's important to realise that strsplit returns a list object. Since your example is only splitting a single item (a vector of length 1) your list is length 1. I'll explain with a slightly different example, inputting a vector of length 3 (3 text items to split):
input <- c( "Hello world", "Hi there", "Back at ya" )
x <- strsplit( input, " " )
> x
[[1]]
[1] "Hello" "world"
[[2]]
[1] "Hi" "there"
[[3]]
[1] "Back" "at" "ya"
Notice that the returned list has 3 elements, one for each element of the input vector. Each of those list elements is split as per the strsplit call. So we can recall any of these list elements using [[ (this is what your x[[2]] call was doing, but you only had one list element, which is why you couldn't get anything in return):
> x[[1]]
[1] "Hello" "world"
> x[[3]]
[1] "Back" "at" "ya"
Now we can get the second part of any of those list elements by appending a [ call:
> x[[1]][2]
[1] "world"
> x[[3]][2]
[1] "at"
This will return the second item from each list element (note that the "Back at ya" input has returned "at" in this case). You can do this for all items at once using something from the apply family. sapply will return a vector, which will probably be good in this case:
> sapply( x, "[", 2 )
[1] "world" "there" "at"
The last value in the input here (2) is passed to the [ operator, meaning the operation x[2] is applied to every list element.
If instead of the second item, you'd like the last item of each list element, we can use tail within the sapply call instead of [:
> sapply( x, tail, 1 )
[1] "world" "there" "ya"
This time, we've applied tail( x, 1 ) to every list element, giving us the last item.
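One detail worth checking is what happens when an input has fewer pieces than the index you ask for. The sketch below uses a hypothetical input in which one string is a single word; indexing past the end gives NA, while tail still returns the last piece. The names parts, second and last are illustrative.

```r
input <- c("Hello world", "Hi", "Back at ya")
parts <- strsplit(input, " ", fixed = TRUE)

# Number of pieces per input string
lengths(parts)
# [1] 2 1 3

# Indexing past the end of a character vector yields NA rather than an error
second <- vapply(parts, function(p) p[2], character(1))
second
# [1] "world" NA      "at"

# tail() always returns the final piece, whatever the length
last <- vapply(parts, function(p) tail(p, 1), character(1))
last
# [1] "world" "Hi"    "ya"
```

Whether NA is the right answer for short inputs depends on the use case, so it may be worth testing lengths(parts) before indexing.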
As a preference, my favourite way to apply actions like these is with the magrittr pipe, for the second word like so:
library(magrittr)
x <- input %>%
strsplit( " " ) %>%
sapply( "[", 2 )
> x
[1] "world" "there" "at"
Or for the last word:
x <- input %>%
strsplit( " " ) %>%
sapply( tail, 1 )
> x
[1] "world" "there" "ya"
Another approach that might be a little easier to read and apply to a data frame within a pipeline (though it takes more lines) would be to wrap it in your own function and apply that.
library(tidyverse)
df <- data.frame(
greetings = c( "Hello world", "Hi there", "Back at ya" )
)
split_params = function (x, sep, n) {
# Splits string into list of substrings separated by 'sep'.
# Returns nth substring.
x = strsplit(x, sep)[[1]][n]
return(x)
}
df = df %>%
mutate(
second_word = sapply(
X = greetings,
FUN = split_params,
# Arguments for split_params.
sep = ' ',
n = 2
)
)
df
### (Output in RStudio Notebook)
greetings second_word
<chr> <chr>
Hello world world
Hi there there
Back at ya at
3 rows
###
With stringr 1.5.0, you can use str_split_i to access the ith element of a split string:
library(stringr)
x <- "Hello World"
str_split_i(x, " ", i = 2)
#[1] "World"
It is vectorized:
x <- c("Hello world", "Hi there", "Back at ya")
str_split_i(x, " ", 2)
#[1] "world" "there" "at"
x=strsplit("a;b;c;d",";")
x
[[1]]
[1] "a" "b" "c" "d"
x=as.character(x[[1]])
x
[1] "a" "b" "c" "d"
x=strsplit(x," ")
x
[[1]]
[1] "a"
[[2]]
[1] "b"
[[3]]
[1] "c"
[[4]]
[1] "d"

Splitting a string in R

Consider:
x<-strsplit("This is an example",split="\\s",fixed=FALSE)
I am surprised to see that x has length 1 rather than length 4:
> length(x)
[1] 1
As a result, x[3] is NULL. But if I unlist, then:
> x<-unlist(x)
> x
[1] "This" "is" "an" "example"
> length(x)
[1] 4
Only now is x[3] equal to "an".
Why wasn't the result of length 4 in the first place, so that its elements could be accessed by indexing? This makes it awkward to access the split elements, since I have to unlist first.
This allows strsplit to be vectorized for its input argument. For instance, it will allow you to split a vector such as:
x <- c("string one", "string two", "and string three")
into a list of split results.
You do not need to unlist, but rather, you can refer to the element by a combination of its list index and the vector index. For instance, if you wanted to get the second word in the second item, you can do:
> x <- c("string one", "string two", "and string three")
> y <- strsplit(x, "\\s")
> y[[2]][2]
[1] "two"
That's because strsplit returns a list with one element per input string; each element is a character vector holding that string's pieces (here, the words).
Try
> x[[1]]
#[1] "This" "is" "an" "example"
and
> length(x[[1]])
#[1] 4
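Putting the pieces together, a short sketch of the structure strsplit returns for the string in the question (third is an illustrative name):

```r
x <- strsplit("This is an example", split = "\\s")

length(x)    # 1: one list element per input string
lengths(x)   # 4: number of words inside that element

# Index the list first with [[ ]], then the character vector with [ ]
third <- x[[1]][3]
third
# [1] "an"

# x[3] asks for a third list element, which does not exist
is.null(x[3][[1]])
# [1] TRUE
```

For a single input string, x[[1]] gives the same character vector as unlist(x), without flattening anything if more inputs are added later.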
