appending to a list with dynamic names - r

I have a list in R:
a <- list(n1 = "hi", n2 = "hello")
I would like to append to this named list, but the names must be dynamic. That is, they are created from a string (for example: paste("another","name",sep="_")).
I tried doing this, which does not work:
c(a, parse(text="paste(\"another\",\"name\",sep=\"_\")=\"hola\""))
What is the correct way to do this? The end goal is just to append to this list and choose my names dynamically.

You could just use indexing with double brackets. Either of the following methods should work.
a <- list(n1 = "hi", n2 = "hello")
val <- "another name"
a[[val]] <- "hola"
a
#$n1
#[1] "hi"
#
#$n2
#[1] "hello"
#
#$`another name`
#[1] "hola"
a[[paste("blah", "ok", sep = "_")]] <- "hey"
a
#$n1
#[1] "hi"
#
#$n2
#[1] "hello"
#
#$`another name`
#[1] "hola"
#
#$blah_ok
#[1] "hey"

You can use setNames to set the names on the fly:
a <- list(n1 = "hi", n2 = "hello")
c(a,setNames(list("hola"),paste("another","name",sep="_")))
Result:
$n1
[1] "hi"
$n2
[1] "hello"
$another_name
[1] "hola"
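If several elements need computed names, the double-bracket assignment shown in the first answer can be wrapped in a small helper. This is only a minimal sketch: append_named is a hypothetical name (not a base R function), and the "item" prefix is just for illustration.

```r
# Hypothetical helper (not part of base R): store `value` under a computed name.
append_named <- function(lst, name, value) {
  lst[[name]] <- value  # adds the element, or overwrites it if the name exists
  lst
}

a <- list(n1 = "hi", n2 = "hello")

# Build each name at run time with paste() and append in a loop.
for (i in 1:2) {
  a <- append_named(a, paste("item", i, sep = "_"), i * 10)
}

names(a)
# [1] "n1"     "n2"     "item_1" "item_2"
```

Assigning NULL with [[<- removes an element instead of adding one, so this sketch assumes value is never NULL.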


R - Merge unique values from two lists using stringr::str_split

I have a function, that when given a list of strings, should return a vector of all unique strings of N size.
get_unique <- function (input_list, size = 3) {
output = c()
for (input in input_list) {
current = stringr::str_replace(input, "[-_\\s]", "")
current = trimws(gsub(paste0("(.{",size,"})"), "\\1 ", current))
parts = stringr::str_split(current, "\\s", simplify = TRUE)[1,]
output = union(output, parts)
}
return(output)
}
The expectation I have would be:
get_unique(c("ABC", "ABCDEF", "GHIDEF"))
[1] "ABC" "DEF" "GHI"
But what I get is:
get_unique(c("ABC", "ABCDEF", "GHIDEF"))
[[1]]
[1] "ABC"
[[2]]
[1] "DEF"
[[3]]
[1] "GHI"
I'm fairly new to R, so I'm having a tough time understanding where I've gone wrong.
We can use unlist at the end:
get_unique <- function (input_list, size = 3) {
output = c()
for (input in input_list) {
current = stringr::str_replace(input, "[-_\\s]", "")
current = trimws(gsub(paste0("(.{",size,"})"), "\\1 ", current))
parts = stringr::str_split(current, "\\s", simplify = TRUE)[1,]
output = union(output, parts)
}
return(unlist(output))
}
get_unique(c("ABC", "ABCDEF", "GHIDEF"))
#[1] "ABC" "DEF" "GHI"
We could also do this in a single line with a regex lookbehind to split after every 3 characters:
unique(unlist(strsplit(v1, "(?<=...)", perl = TRUE)))
#[1] "ABC" "DEF" "GHI"
data
v1 <- c("ABC", "ABCDEF", "GHIDEF")
A full base R solution, using substr:
get_unique <- function(v) unique(unlist(sapply(v, function(x) sapply(1:(nchar(x)/3), function(y) substr(x, 3*(y-1)+1, 3*y) ))))
get_unique(v1)
[1] "ABC" "DEF" "GHI"
substr(x, 3*(y-1)+1, 3*y) grabs the y-th 3-character substring of x.
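The substr idea above hard-codes a chunk size of 3. A possible generalisation is sketched below using the vectorised substring(); split_chunks and get_unique_chunks are hypothetical names, and the sketch assumes every input string is non-empty (seq() would fail on an empty string).

```r
# Hypothetical helper: cut one string into consecutive pieces of `size` characters.
split_chunks <- function(x, size = 3) {
  starts <- seq(1, nchar(x), by = size)      # start position of each piece
  substring(x, starts, starts + size - 1)    # substring() is vectorised over starts
}

# Collect the pieces of every string and keep each distinct piece once.
get_unique_chunks <- function(v, size = 3) {
  unique(unlist(lapply(v, split_chunks, size = size)))
}

get_unique_chunks(c("ABC", "ABCDEF", "GHIDEF"))
# [1] "ABC" "DEF" "GHI"
```

If a string's length is not a multiple of size, the last piece is simply shorter, because substring() stops at the end of the string.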

Assign values with same name in a nested list in R

I want to assign the same value to specific elements in nested lists that have the same name. I would also like to create the element if it didn't exist in the nested list, but this isn't shown in my example.
For example, let's say I have:
ls <- list(a = list(e1 = "value1.1", e2 = "value1.2"),
b = list(e1 = "value2.1", e2 = "value2.2"))
And I want to assign the value "same value" to all the elements in sublists that are named e1 so that I would end up with this desired output:
list(a = list(e1 = "same value", e2 = "value1.2"),
b = list(e1 = "same value", e2 = "value2.2"))
> ls
$a
$a$e1
[1] "same value"
$a$e2
[1] "value1.2"
$b
$b$e1
[1] "same value"
$b$e2
[1] "value2.2"
After some research, I found the function modify_depth() in the package purrr, which will apply a function to the nested lists but won't assign a value.
My only other solution was to do this:
ls2 <- list()
for(sublist in ls){
sublist[["e1"]] <- "same value"
ls2 <- c(ls2, list(sublist))
}
names(ls2) <- names(ls)
ls <- ls2
rm(ls2)
Note: after the loop, "same value" is not assigned in ls, so I have to create ls2. I could make this a function, but I'm sure there is a better way to do this, without a for loop.
Thanks for your help!
With map and replace from purrr:
library(purrr)
map(ls, ~replace(., "e1", "same value"))
or with modify_depth:
modify_depth(ls, 1, ~replace(., "e1", "same value"))
replace also works with lapply:
lapply(ls, replace, "e1", "same value")
Output:
$a
$a$e1
[1] "same value"
$a$e2
[1] "value1.2"
$b
$b$e1
[1] "same value"
$b$e2
[1] "value2.2"
The good thing about modify_depth is that you can choose the depth, whereas map and lapply only go down one level:
ls2 <- list(c = list(a1 = list(e1 = "value1.1", e2 = "value1.2")),
d = list(a1 = list(e1 = "value2.1", e2 = "value2.2")))
modify_depth(ls2, 2, ~replace(., "e1", "same value"))
Output:
$c
$c$a1
$c$a1$e1
[1] "same value"
$c$a1$e2
[1] "value1.2"
$d
$d$a1
$d$a1$e1
[1] "same value"
$d$a1$e2
[1] "value2.2"
Edit: If we want to apply a vector of values to each e1 element of ls2, modify_depth would not work since it does not have a map2 variant. Instead, I would use two layers of map calls: map2 on the top layer and map on the second layer.
vec <- c("newvalue1.1", "newvalue2.1")
map2(ls2, vec, ~map(.x, replace, "e1", .y))
Output:
$c
$c$a1
$c$a1$e1
[1] "newvalue1.1"
$c$a1$e2
[1] "value1.2"
$d
$d$a1
$d$a1$e1
[1] "newvalue2.1"
$d$a1$e2
[1] "value2.2"
We can use lapply to iterate over the list elements, then use [ with the equality operator == to locate the values we want to change:
> lapply(ls, function(x){
x[names(x)=="e1"] <- "same.value"
x
} )
$`a`
$`a`$`e1`
[1] "same.value"
$`a`$e2
[1] "value1.2"
$b
$b$`e1`
[1] "same.value"
$b$e2
[1] "value2.2"
ls is not reserved in R, but it is the name of a base function (it lists the objects in an environment), so I suggest using another name for the list. My alternative is:
y <- list(a = list(e1 = "value1.1", e2 = "value1.2"),
b = list(e1 = "value2.1", e2 = "value2.2"))
lapply(y, function(x) replace(x, list = "e1", values = "new value"))
$a
$a$e1
[1] "new value"
$a$e2
[1] "value1.2"
$b
$b$e1
[1] "new value"
$b$e2
[1] "value2.2"
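The question also asks for e1 to be created when a sublist does not have it yet. As a small sketch with made-up data (nested and result are hypothetical names), assignment with [[<- inside lapply covers both cases: it overwrites an existing e1 and appends a new one otherwise.

```r
# Made-up example: sublist b has no e1 element.
nested <- list(a = list(e1 = "value1.1", e2 = "value1.2"),
               b = list(e2 = "value2.2"))

# [[<- overwrites e1 where it exists and appends it where it does not.
result <- lapply(nested, function(x) {
  x[["e1"]] <- "same value"
  x
})

str(result)
```

In sublist b the new e1 is added after e2, so element order can differ between sublists.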

all strings of length k that can be formed from a set of n characters

This question has been asked for other languages but I'm looking for the most idiomatic way to find all strings of length k that can be formed from a set of n characters in R
Example input and output:
input <- c('a', 'b')
output <- c('aa', 'ab', 'ba', 'bb')
A little more complicated than I'd like. I think outer() only works for n=2. combn doesn't include repeats.
allcomb <- function(input = c('a', 'b'), n=2) {
args <- rep(list(input),n)
gr <- do.call(expand.grid,args)
return(do.call(paste0,gr))
}
Thanks to @thelatemail for improvements ...
allcomb(n=4)
## [1] "aaaa" "baaa" "abaa" "bbaa" "aaba" "baba" "abba"
## [8] "bbba" "aaab" "baab" "abab" "bbab" "aabb" "babb"
## [15] "abbb" "bbbb"
Adapting AK88's answer, outer can be used for arbitrary values of k, although it's not necessarily the most efficient solution:
input <- c('a', 'b')
k = 5
perms = input
for (i in 2:k) {
perms = outer(perms, input, paste, sep="")
}
result = as.vector(perms)
m <- outer(input, input, paste, sep="")
output = as.vector(m)
## "aa" "ba" "ab" "bb"
I'm not proud of how this looks, but it works...
allcombs <- function(x, k) {
apply(expand.grid(split(t(replicate(k, x)), seq_len(k))), 1, paste, collapse = "")
}
allcombs(letters[1:2], 2)
#> [1] "aa" "ba" "ab" "bb"
allcombs(letters[1:2], 4)
#> [1] "aaaa" "baaa" "abaa" "bbaa" "aaba" "baba" "abba" "bbba" "aaab" "baab"
#> [11] "abab" "bbab" "aabb" "babb" "abbb" "bbbb"
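The expand.grid answers above vary the first character fastest ("aa" "ba" "ab" "bb"), while the example in the question lists the last character fastest. A possible variant that matches the question's order is sketched here; all_strings is a hypothetical name, and reversing the columns before pasting is my own adjustment rather than something from the answers.

```r
# Hypothetical variant: all strings of length k over `chars`, last position varying fastest.
all_strings <- function(chars, k) {
  # One column per position; expand.grid varies the first column fastest.
  grid <- expand.grid(rep(list(chars), k), stringsAsFactors = FALSE)
  # Reverse the column order so that, after pasting, the last position changes fastest.
  cols <- unname(rev(as.list(grid)))
  do.call(paste0, cols)
}

all_strings(c("a", "b"), 2)
# [1] "aa" "ab" "ba" "bb"
```

The result has length(chars)^k elements, so it grows quickly with k. Unlike the outer() loop with 2:k, this also handles k = 1.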

Accessing element of a split string in R

If I have a string,
x <- "Hello World"
How can I access the second word, "World", using string split, after
x <- strsplit(x, " ")
x[[2]] does not do anything.
As mentioned in the comments, it's important to realise that strsplit returns a list object. Since your example is only splitting a single item (a vector of length 1) your list is length 1. I'll explain with a slightly different example, inputting a vector of length 3 (3 text items to split):
input <- c( "Hello world", "Hi there", "Back at ya" )
x <- strsplit( input, " " )
> x
[[1]]
[1] "Hello" "world"
[[2]]
[1] "Hi" "there"
[[3]]
[1] "Back" "at" "ya"
Notice that the returned list has 3 elements, one for each element of the input vector. Each of those list elements is split as per the strsplit call. So we can recall any of these list elements using [[ (this is what your x[[2]] call was doing, but you only had one list element, which is why you couldn't get anything in return):
> x[[1]]
[1] "Hello" "world"
> x[[3]]
[1] "Back" "at" "ya"
Now we can get the second part of any of those list elements by appending a [ call:
> x[[1]][2]
[1] "world"
> x[[3]][2]
[1] "at"
This will return the second item from each list element (note that the "Back at ya" input has returned "at" in this case). You can do this for all items at once using something from the apply family. sapply will return a vector, which will probably be good in this case:
> sapply( x, "[", 2 )
[1] "world" "there" "at"
The last value in the input here (2) is passed to the [ operator, meaning the operation x[2] is applied to every list element.
If instead of the second item, you'd like the last item of each list element, we can use tail within the sapply call instead of [:
> sapply( x, tail, 1 )
[1] "world" "there" "ya"
This time, we've applied tail( x, 1 ) to every list element, giving us the last item.
As a preference, my favourite way to apply actions like these is with the magrittr pipe, for the second word like so:
x <- input %>%
strsplit( " " ) %>%
sapply( "[", 2 )
> x
[1] "world" "there" "at"
Or for the last word:
x <- input %>%
strsplit( " " ) %>%
sapply( tail, 1 )
> x
[1] "world" "there" "ya"
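If some inputs might have fewer words than requested, a vapply version makes the return type explicit. This is a minimal sketch; nth_word is a hypothetical helper, and fixed = TRUE is my assumption that the separator is a literal string rather than a regex.

```r
# Hypothetical helper: n-th piece of each string after splitting on a literal separator.
nth_word <- function(x, n, sep = " ") {
  parts <- strsplit(x, sep, fixed = TRUE)           # list with one character vector per input
  vapply(parts, function(p) p[n], character(1))     # p[n] is NA when there is no n-th piece
}

nth_word(c("Hello world", "Hi there", "Solo"), 2)
# [1] "world" "there" NA
```

Unlike sapply, vapply raises an error if any result is not a single character value, so the output is always a character vector.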
Another approach that might be a little easier to read and apply to a data frame within a pipeline (though it takes more lines) would be to wrap it in your own function and apply that.
library(tidyverse)
df <- data.frame(
greetings = c( "Hello world", "Hi there", "Back at ya" )
)
split_params = function (x, sep, n) {
# Splits string into list of substrings separated by 'sep'.
# Returns nth substring.
x = strsplit(x, sep)[[1]][n]
return(x)
}
df = df %>%
mutate(
second_word = sapply(
X = greetings,
FUN = split_params,
# Arguments for split_params.
sep = ' ',
n = 2
)
)
df
### (Output in RStudio Notebook)
greetings second_word
<chr> <chr>
Hello world world
Hi there there
Back at ya at
3 rows
###
With stringr 1.5.0, you can use str_split_i to access the ith element of a split string:
library(stringr)
x <- "Hello World"
str_split_i(x, " ", i = 2)
#[1] "World"
It is vectorized:
x <- c("Hello world", "Hi there", "Back at ya")
str_split_i(x, " ", 2)
#[1] "world" "there" "at"
x=strsplit("a;b;c;d",";")
x
[[1]]
[1] "a" "b" "c" "d"
x=as.character(x[[1]])
x
[1] "a" "b" "c" "d"
x=strsplit(x," ")
x
[[1]]
[1] "a"
[[2]]
[1] "b"
[[3]]
[1] "c"
[[4]]
[1] "d"

R: Apostrophes in recode()

I am using the recode() function in the car package to recode an integer class variable in a data frame. I am trying to recode one of the values of the variable to a string that contains a single apostrophe ('). However, this does not work. I imagine it is because the apostrophe prematurely ends the quoted string. So, I tried to use \' to escape it, but that doesn't work either.
I would prefer to continue using recode() but if that is not an option, alternatives are welcome.
A working example:
# Load car and dplyr
library(car)
library(dplyr)
# Set up df
a <- seq(1:3)
b <- rep(9,3)
df <- cbind(a,b) %>% as.data.frame(.)
# Below works because none of the recoding includes an apostrophe:
recode(df$a, "1 = 'foo'; 2 = 'bar'; 3 = 'foobar'")
# Below doesn't work due to apostrophe in foofoo's:
recode(df$a, "1 = 'foo'; 2 = 'bar'; 3 = 'foofoo's'")
# Escaping doesn't fix it:
recode(df$a, "1 = 'foo'; 2 = 'bar'; 3 = 'foofoo\'s'")
We could escape the quotes to make it work
recode(df$a, "1 = \"foo\"; 2 = \"bar\"; 3 = \"foofoo's\"")
#[1] "foo" "bar" "foofoo's"
A base R alternative would be to use the df$a values as numeric index to replace those values
df$a <- c("foo", "bar", "foobar's")[df$a]
df$a
#[1] "foo" "bar" "foobar's"
Suppose the values are not numeric and not in sequence.
set.seed(24)
v1 <- sample(LETTERS[1:3], 10, replace=TRUE)
v1
#[1] "A" "A" "C" "B" "B" "C" "A" "C" "C" "A"
as.vector(setNames(c("foo", "bar", "foobar's"), LETTERS[1:3])[v1])
#[1] "foo" "foo" "foobar's" "bar" "bar" "foobar's"
#[7] "foo" "foobar's" "foobar's" "foo"
Here, we replace "A" with "foo", "B" with "bar" and "C" with "foobar's". To do that, create a named key/value vector to replace values in 'v1'.
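To see why \' did not help in the question: inside a double-quoted R string, \' is just another way of writing ', so recode() receives exactly the same text either way. The sketch below checks that and then shows the named-vector lookup applied to numeric codes; the names escaped, plain, lookup and codes are made up for illustration.

```r
# Inside double quotes, \' and ' produce the same character.
escaped <- "3 = 'foofoo\'s'"
plain   <- "3 = 'foofoo's'"
identical(escaped, plain)
# [1] TRUE

# Named key/value vector: the names are the old codes, the values are the new labels.
lookup <- c("1" = "foo", "2" = "bar", "3" = "foofoo's")
codes  <- c(1, 3, 2, 3)

# Index by name (as character) so the codes do not have to be 1, 2, 3, ...
unname(lookup[as.character(codes)])
# [1] "foo"      "foofoo's" "bar"      "foofoo's"
```

Indexing by as.character(codes) rather than by position means the same approach would work for codes such as 10, 20, 30.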
