How to rename list objects in self defined function?

quote <- function(namefoo, namebar){
set.seed(3)
foo <- rnorm(n = 5)
bar <- rnorm(n = 5)
return(list(namefoo=foo,namebar=bar))
}
With the function above, if I run quote(test, test1), the two objects in the list are still named namefoo and namebar instead of what I specified in the function call.
If I just run the code separately as:
set.seed(3)
foo <- rnorm(n = 5)
bar <- rnorm(n = 5)
obj <- list(test=foo,test1=bar)
Then obj will return foo and bar with the amended names. How do I make my function do this? I've tried several combinations of including quotes as well, from the function call to the function itself but it doesn't seem to work.

One way is this:
quote <- function(namefoo, namebar){
set.seed(3)
foo <- rnorm(n = 5)
bar <- rnorm(n = 5)
out <- list(foo, bar)
names(out) <- c(namefoo, namebar)
out
}
You can save the list to a variable and then name the elements with names().
# quote('foo', 'bar')
# $foo
# [1] -0.9619334 -0.2925257 0.2587882 -1.1521319
# [5] 0.1957828
#
# $bar
# [1] 0.03012394 0.08541773 1.11661021
# [4] -1.21885742 1.26736872
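To see why the original version did not work, here is a minimal sketch. Inside list(), the left-hand side of `=` is taken literally as the element name and is never evaluated, so the argument value is ignored. The function name make_named_list is made up for illustration (it also avoids masking base R's quote).

```r
# Hypothetical helper; the name make_named_list is not from the original post.
make_named_list <- function(namefoo, namebar) {
  set.seed(3)
  foo <- rnorm(n = 5)
  bar <- rnorm(n = 5)

  # Here "namefoo" and "namebar" are used literally as element names;
  # the values passed by the caller are never looked at.
  literal <- list(namefoo = foo, namebar = bar)

  # Assigning names afterwards uses the *values* of the arguments.
  out <- list(foo, bar)
  names(out) <- c(namefoo, namebar)

  list(literal_names = names(literal), result = out)
}

res <- make_named_list("test", "test1")
res$literal_names   # "namefoo" "namebar"
names(res$result)   # "test" "test1"
```

Note that the names have to be passed as character strings for this to work.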

It's a very bad idea to name your function quote, because a very important base R function already has that name.
Use setNames :
fun <- function(namefoo, namebar){
set.seed(3)
foo <- rnorm(n = 5)
bar <- rnorm(n = 5)
setNames(list(foo,bar),c(namefoo, namebar))
}
fun("hi","there")
# $hi
# [1] -0.9619334 -0.2925257 0.2587882 -1.1521319 0.1957828
#
# $there
# [1] 0.03012394 0.08541773 1.11661021 -1.21885742 1.26736872
You might also see this kind of code around too, using more advanced features of rlang / tidyverse :
library(tidyverse)
fun2 <- function(namefoo, namebar){
set.seed(3)
foo <- rnorm(n = 5)
bar <- rnorm(n = 5)
lst(!!namefoo := foo,!!namebar := bar)
}
fun2("hi","there")
# $hi
# [1] -0.9619334 -0.2925257 0.2587882 -1.1521319 0.1957828
#
# $there
# [1] 0.03012394 0.08541773 1.11661021 -1.21885742 1.26736872

We can do
quotefn <- function(...) {
nm <- c(...)
out <- replicate(length(nm), rnorm(n = 5), simplify = FALSE)
names(out) <- nm
out}
quotefn("foo", "bar")
#$foo
#[1] -0.5784837 -0.9423007 -0.2037282 -1.6664748 -0.4844551
#$bar
#[1] -0.74107266 1.16061578 1.01206712 -0.07207847 -1.13678230

Related

Function to apply mean on a list of vectors without need to list the vectors (changes to a function call)

I have multiple objects and I need to apply some function to them (mean, in my example). But the function call shouldn't include list; it must look like this: my_function(a, b, d).
Please advise how to do this. I probably need quote or substitute, but I'm not sure how to use them.
a <- c(1:15)
b <- c(1:17)
d <- c(1:19)
my_function <- function(objects) {
lapply(objects, mean)
}
my_function(list(a, b, d))

A possible solution:
a <- c(1:15)
b <- c(1:17)
d <- c(1:19)
my_function <- function(...) {
lapply(list(...), mean)
}
my_function(a, b, d)
#> [[1]]
#> [1] 8
#>
#> [[2]]
#> [1] 9
#>
#> [[3]]
#> [1] 10
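Since the question mentions substitute, here is a sketch of how it could be used to label each result with the name of the object that was passed in. The name my_named_function is made up for illustration, and the sketch assumes each argument is a simple expression (such as a variable name) that deparses to a single string.

```r
a <- 1:15
b <- 1:17
d <- 1:19

# Hypothetical variant of my_function that names each result after the
# expression the caller typed for it.
my_named_function <- function(...) {
  # substitute(list(...)) gives the unevaluated call list(a, b, d);
  # dropping the first element leaves only the argument expressions.
  arg_exprs <- as.list(substitute(list(...)))[-1]
  arg_names <- vapply(arg_exprs, deparse, character(1))
  setNames(lapply(list(...), mean), arg_names)
}

res <- my_named_function(a, b, d)
names(res)  # "a" "b" "d"
```

The means themselves are still computed from list(...), so substitute is only needed for the labels.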

To still be able to benefit from the other arguments of mean such as na.rm= and trim=, i.e. to generalize, we may match the formalArgs with the dots and split the call accordingly.
my_function <- function(...) {
cl <- match.call()
m <- match(formalArgs(base:::mean.default), names(cl), 0L)
vapply(as.list(cl)[-c(1L, m)], function(x) {
eval(as.call(c(quote(base:::mean.default), list(x), as.list(cl[m]))))
}, numeric(1L))
}
## OP's example
my_function(a, b, d)
# [1] 8 9 10
## generalization:
set.seed(42)
my_function(rnorm(12), rnorm(5), c(NA, rnorm(3)))
# [1] 0.7553736 -0.2898547 NA
set.seed(42)
my_function(rnorm(12), rnorm(5), c(NA, rnorm(3)), na.rm=TRUE)
# [1] 0.7553736 -0.2898547 -1.2589363
set.seed(42)
my_function(rnorm(12), rnorm(5), c(NA, rnorm(3)), na.rm=TRUE, trim=.5)
# [1] 0.5185655 -0.2787888 -2.4404669
Data:
a <- 1:15; b <- 1:17; d <- 1:19

For loop: paste index into string

This may strike you as odd, but I want to exactly achieve the following: I want to get the index of a list pasted into a string containing a string reference to a subset of this list.
For illustration:
l1 <- list(a = 1, b = 2)
l2 <- list(a = 3, b = 4)
l <- list(l1,l2)
X_l <- vector("list", length = length(l))
for (i in 1:length(l)) {
X_l[[i]] = "l[[ #insert index number as character# ]]$l_1*a"
}
In the end, I want something like this:
X_l_wanted <- list("l[[1]]$l_1*a","l[[2]]$l_1*a")

You can use sprintf/paste0 directly :
sprintf('l[[%d]]$l_1*a', seq_along(l))
#[1] "l[[1]]$l_1*a" "l[[2]]$l_1*a"
If you want final output as list :
as.list(sprintf('l[[%d]]$l_1*a', seq_along(l)))
#[[1]]
#[1] "l[[1]]$l_1*a"
#[[2]]
#[1] "l[[2]]$l_1*a"
Using paste0 :
paste0('l[[', seq_along(l), ']]$l_1*a')
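As a quick check, the sketch below confirms that the sprintf and paste0 versions build the same strings, and shows why seq_along() is a safer index than 1:length(). The variable names idx, via_sprintf, via_paste0 and empty are made up for illustration.

```r
l <- list(list(a = 1, b = 2), list(a = 3, b = 4))

idx <- seq_along(l)  # 1 2

# Both functions are vectorised over idx, so no loop is needed.
via_sprintf <- sprintf("l[[%d]]$l_1*a", idx)
via_paste0  <- paste0("l[[", idx, "]]$l_1*a")

identical(via_sprintf, via_paste0)  # TRUE

# For an empty list seq_along() gives integer(0), while 1:length(empty)
# would give c(1, 0) and index elements that do not exist.
empty <- list()
no_strings <- sprintf("l[[%d]]$l_1*a", seq_along(empty))
length(no_strings)  # 0
```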

Try paste0() inside your loop. That is the way to concatenate strings. Here is the solution with slight changes to your code:
#Data
l1 <- list(a = 1, b = 2)
l2 <- list(a = 3, b = 4)
l <- list(l1,l2)
#List
X_l <- vector("list", length = length(l))
#Loop
for (i in 1:length(l)) {
#Code
X_l[[i]] = paste0('l[[',i,']]$l_1*a')
}
Output:
X_l
[[1]]
[1] "l[[1]]$l_1*a"
[[2]]
[1] "l[[2]]$l_1*a"

Or you could do it with lapply()
library(glue)
X_l <- lapply(seq_along(l), function(i) glue("l[[{i}]]$l_1*a"))
X_l
# [[1]]
# l[[1]]$l_1*a
# [[2]]
# l[[2]]$l_1*a

Find variables that occur only in one cluster in data.frame in R

Using base R, I wonder how to answer the following question:
Is there any value of X or Y (i.e., the variables of interest) that occurs in only one element of m (as a cluster) but not in the others? If yes, produce my desired output below.
For example:
Here we see that X == 3 occurs only in element m[[3]] but not in m[[1]] and m[[2]].
We also see that Y == 99 occurs only in m[[1]] but not in the others.
Note: the following is a toy example, so a functional answer is appreciated. Also, X and Y may or may not be numeric (e.g., they could be strings).
f <- data.frame(id = c(rep("AA",4), rep("BB",2), rep("CC",2)), X = c(1,1,1,1,1,1,3,3),
Y = c(99,99,99,99,6,6,6,6))
m <- split(f, f$id) # Here is `m`
mods <- names(f)[-1] # variables of interest names
Desired output:
list(AA = c(Y = 99), CC = c(X = 3))
# $AA
# Y
# 99
# $CC
# X
# 3

This is a solution based on rapply() and table().
ux <- rapply(m, unique)
tb <- table(uxm <- ux[gsub(rx <- "^.*\\.(.*)$", "\\1", names(ux)) %in% mods])
r <- Map(setNames, n <- uxm[uxm %in% names(tb)[tb == 1]], gsub(rx, "\\1", names(n)))
setNames(r, gsub("^(.*)\\..*$", "\\1", names(r)))
# $AA
# Y
# 99
#
# $CC
# X
# 3

tmp = do.call(rbind, lapply(names(f)[-1], function(x){
d = unique(f[c("id", x)])
names(d) = c("id", "val")
transform(d, nm = x)
}))
tmp = tmp[ave(as.numeric(as.factor(tmp$val)), tmp$val, FUN = length) == 1,]
lapply(split(tmp, tmp$id), function(a){
setNames(a$val, a$nm)
})
#$AA
# Y
#99
#$BB
#named numeric(0)
#$CC
#X
#3

This utilizes @jay.sf's idea of rapply() with an idea from a previous answer:
vec <- rapply(lapply(m, '[', , mods), unique)
unique_vec <- vec[!duplicated(vec) & !duplicated(vec, fromLast = T)]
vec_names <- do.call(rbind, strsplit(names(unique_vec), '.', fixed = T))
names(unique_vec) <- vec_names[, 2]
split(unique_vec, vec_names[, 1])
$AA
Y
99
$CC
X
3
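For comparison, here is a more explicit base R sketch of the same idea: for each variable, keep one row per (cluster, value) pair, count in how many clusters each value appears, and keep the values that appear in exactly one. The function name unique_to_cluster and its arguments are made up for illustration. It assumes the values of one cluster can be combined with c(), so mixing numeric and string variables would coerce them to a common type.

```r
f <- data.frame(id = c(rep("AA", 4), rep("BB", 2), rep("CC", 2)),
                X = c(1, 1, 1, 1, 1, 1, 3, 3),
                Y = c(99, 99, 99, 99, 6, 6, 6, 6))
mods <- names(f)[-1]  # variables of interest

# Hypothetical helper: values of `vars` that occur in exactly one cluster.
unique_to_cluster <- function(data, vars, id = "id") {
  out <- list()
  for (v in vars) {
    pairs  <- unique(data[c(id, v)])   # one row per (cluster, value)
    counts <- table(pairs[[v]])        # number of clusters per value
    keep   <- as.character(pairs[[v]]) %in% names(counts)[counts == 1]
    only_one <- pairs[keep, ]
    for (i in seq_len(nrow(only_one))) {
      key <- as.character(only_one[[id]][i])
      # Append the value to this cluster's vector, named by its variable.
      out[[key]] <- c(out[[key]], setNames(only_one[[v]][i], v))
    }
  }
  out[order(names(out))]  # sort clusters by name
}

res <- unique_to_cluster(f, mods)
res
```

Clusters with no unique values (BB here) are simply left out of the result.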

Calling setdiff() on multiple vectors

How can I use setdiff() in R to get the elements that are in one vector but not in the others? My example is as follows:
dat1 <- c("osa", "bli", "usd", "mnl")
dat2 <- c("mnu", "erd", "usd", "mnl")
dat3 <- c("ssu", "erd", "usd", "mnl")
The following code only returns what is different in dat1 compared to dat2 and dat3:
diffs <- Reduce(setdiff,
list(A = dat1,
B = dat2,
C = dat3
))
How can I modify this code to get all the elements that are uniquely present in one vector compared to the others? Thanks

another solution using setdiff :
myl <- list(A = dat1,
B = dat2,
C = dat3)
lapply(1:length(myl), function(n) setdiff(myl[[n]], unlist(myl[-n])))
[[1]]
[1] "osa" "bli"
[[2]]
[1] "mnu"
[[3]]
[1] "ssu"
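The lapply() call above drops the list names A, B and C. A small sketch that keeps them is shown below; the function name unique_to_each is made up for illustration.

```r
dat1 <- c("osa", "bli", "usd", "mnl")
dat2 <- c("mnu", "erd", "usd", "mnl")
dat3 <- c("ssu", "erd", "usd", "mnl")
myl <- list(A = dat1, B = dat2, C = dat3)

# Hypothetical helper: for each vector, keep the elements that do not
# appear in any of the other vectors, and carry the list names over.
unique_to_each <- function(x) {
  res <- lapply(seq_along(x), function(n) setdiff(x[[n]], unlist(x[-n])))
  setNames(res, names(x))
}

res <- unique_to_each(myl)
res$A  # "osa" "bli"
res$B  # "mnu"
res$C  # "ssu"
```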

a second possibility :
f <- function (...)
{
aux <- list(...)
ind <- rep(1:length(aux), sapply(aux, length))
x <- unlist(aux)
boo <- !(duplicated(x) | duplicated(x, fromLast = T))
split(x[boo], ind[boo])
}
f(dat1, dat2, dat3)
$`1`
[1] "osa" "bli"
$`2`
[1] "mnu"
$`3`
[1] "ssu"

Try this:
all.dat <- list(dat1, dat2, dat3)
from.dat <- rep(seq_along(all.dat), sapply(all.dat, length))
in.dat <- split(from.dat, unlist(all.dat))
in.one.dat <- in.dat[sapply(in.dat, length) == 1]
in.one.dat
# $bli
# [1] 1
# $mnu
# [1] 2
# $osa
# [1] 1
# $ssu
# [1] 3
This tells you which items are found in only one of the dat objects, and in which one. If you only care about the names, then finish with: names(in.one.dat).

Function argument as a part of the output name

Perhaps a silly question, but I can't find any answers to it anywhere (that I've looked :P ). I am trying to create a function with two arguments, which will be vectors (e.g. x=c(a,b,c) and y=c(50,75,100)). The function should calculate all the combinations of these and use the arguments as part of the output name. E.g.
function(x,y)
df$output_a_50 = a*2+50^2
df$output_a_75 = a*2+75^2
.....
Any suggestions will be appreciated :)

As @Spacedman and others discussed, your problem is that if you pass c(a, b, c) to your function, the names will be lost. The best alternative, in my opinion, is to pass a list:
foo <- function(x, y) {
df <- list()
for (xx in names(x)) {
for (yy in y) {
varname <- paste("output", xx, yy, sep = "_")
df[[varname]] <- x[[xx]]*2 + yy^2
}
}
df
}
foo(x = list(a = NA, b = 1, c = 2:3),
y = c(50, 75, 100))
# $output_a_50
# [1] NA
#
# $output_a_75
# [1] NA
#
# $output_a_100
# [1] NA
#
# $output_b_50
# [1] 2502
#
# $output_b_75
# [1] 5627
#
# $output_b_100
# [1] 10002
#
# $output_c_50
# [1] 2504 2506
#
# $output_c_75
# [1] 5629 5631
#
# $output_c_100
# [1] 10004 10006
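If every element of x is a single number, a named vector such as c(a = 1, b = 2) keeps its names too, and the nested loops can be replaced with expand.grid(). This is only a sketch under that assumption; the function name foo_grid is made up for illustration.

```r
# Hypothetical helper, assuming x is a named numeric vector of scalars.
foo_grid <- function(x, y) {
  # One row per (y, name) combination; the first column varies fastest,
  # so all y values are listed for "a" before moving on to "b".
  grid <- expand.grid(y = y, x_name = names(x), stringsAsFactors = FALSE)
  values <- x[grid$x_name] * 2 + grid$y^2
  out_names <- paste("output", grid$x_name, grid$y, sep = "_")
  setNames(as.list(unname(values)), out_names)
}

res <- foo_grid(x = c(a = 1, b = 2), y = c(50, 75))
names(res)
# "output_a_50" "output_a_75" "output_b_50" "output_b_75"
res$output_a_50  # 2502
res$output_b_75  # 5629
```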
