Extract names from a list using map() function - r

I already have a list containing all the functions in dplyr, created with this code:
content <- mget(ls("package:dplyr"), inherits = TRUE)
dplyr_functions <- Filter(is.function, content)
The result I want is the same as the output of
names(dplyr_functions)
That is a character vector containing the names of all functions in the dplyr package.
But when I use a map function, my code looks like this:
dplyr_name <- map_chr(dplyr_functions, names)
It throws an error saying:
"Result 1 must be a single string, not NULL of length 0"
What does the error mean, and how can I use map_chr to get a vector containing all the names in dplyr_functions?

map loops over the content (the "value") of each list element, i.e. dplyr_functions[[1]] and so on, not over the one-element sublist dplyr_functions[1]; try both to see the difference. A function object carries no names, so names(dplyr_functions[[1]]) returns NULL and map_chr fails, while names(dplyr_functions[1]) returns "%>%", which map_chr can handle.
So we can either loop over the list indices and subset with single brackets, or use imap, which is designed to pass each element's name (as .y) alongside its value.
library(purrr)
map_chr(seq_along(dplyr_functions), ~names(dplyr_functions[.x]))
#or
imap_chr(dplyr_functions, ~.y) %>% unname()
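The difference between the two kinds of subsetting can be checked on a small hand-made list. This is a minimal sketch in base R: the list fns and its contents are made up for illustration, and vapply stands in for map_chr so that no package is needed.

```r
# A small named list of functions, standing in for dplyr_functions (hypothetical)
fns <- list(mean = mean, median = median)

# [[ ]] extracts the function itself, which carries no names
inner <- names(fns[[1]])
is.null(inner)

# [ ] keeps a one-element list, which still carries its name
outer <- names(fns[1])
outer

# Looping over the indices and subsetting with [ ] recovers every name
all_names <- vapply(seq_along(fns), function(i) names(fns[i]), character(1))
identical(all_names, names(fns))
```

The same index-based loop is what the map_chr(seq_along(...)) call above does with purrr.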

Related

When creating new data.frame column, what is the difference between `df$NewCol=` and `df[,"NewCol"]=` methods?

Using the built-in iris data frame in R, why does creating a new column "NewCol" like this
iris[,'NewCol'] = as.POSIXlt(Sys.Date()) # throws Warning
give a warning, while
iris$NewCol = as.POSIXlt(Sys.Date()) # is correct
works? This issue doesn't occur when assigning basic types such as character, integer or double.
First, notice that, as @sindri_baldur pointed out, as.POSIXlt returns a list.
From R help ($<-.data.frame):
There is no data.frame method for $, so x$name uses the default method which treats x as a list (with partial matching of column names if the match is unique, see Extract). The replacement method (for $) checks value for the correct number of rows, and replicates it if necessary.
So, if you try iris[, "NewCol"] <- as.POSIXlt(Sys.Date()) you get a warning because you are trying to assign a list object to a single column, so only the first element of the list is used.
Again, from R help:
"For [ the replacement value can be a list: each element of the list is used to replace (part of) one column, recycling the list as necessary".
And in your case only one column is specified, meaning that only the first element of the list returned by as.POSIXlt will be used, and you are warned about that.
With the $ syntax, the iris data.frame is treated as a list, and the result of as.POSIXlt (again a list) is appended to it. The result is still a data.frame, but take a look at the type of the new column: it's a list.
iris[, "NewCol"] <- as.POSIXlt(Sys.Date()) # warning
iris$NewCol2 <- as.POSIXlt(Sys.Date())
typeof(iris$NewCol) # double
typeof(iris$NewCol2) # list
Suggestion: maybe you wanted to use as.POSIXct()?
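To see why as.POSIXct is the easier fit for a data frame column, compare the underlying types of the two classes. This is a small sketch; the data frame df and the column name when are made up for illustration.

```r
lt <- as.POSIXlt(Sys.Date())
ct <- as.POSIXct(Sys.Date())

typeof(lt)  # "list": a POSIXlt value is a list of date-time components
typeof(ct)  # "double": a POSIXct value is a plain numeric vector with a class

# Hypothetical small data frame; the single POSIXct value is recycled
# to the number of rows and stays an atomic date-time column
df <- data.frame(x = 1:3)
df$when <- ct
class(df$when)
typeof(df$when)
```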

Modify ellipsis in R

I have a problem with an ellipsis use case. My function accepts a list of objects; let's call them objects of class "X". Inside my function these X objects are converted to class "Xs", so I end up with a list of "Xs" objects. A function that I import from another package can process multiple "Xs" objects at once, but they have to be passed as separate arguments (through the ellipsis), not as a list. Is there a way to solve this? I want something like this:
examplefun <- function(charlist){
nums <- lapply(charlist, as.numeric)
sum(... = nums)
}
Of course the example above throws an error, but it shows what I want to achieve. I tried unlist with recursive = FALSE ("X" and "Xs" are lists themselves), but it does not work.
If there is no solution, then:
Let's assume I decided to accept ... instead of a list of "X" objects. Can I modify the ellipsis elements (change them to "Xs") and then pass them to a function that accepts an ellipsis? It would look like this:
examplefun2 <- function(...){
# some function that converts the objects in ... to "Xs" objects
sum(...)
}
In your first function, just call sum directly, because sum works on a vector of numbers as well as on individual numbers.
examplefun <- function (charlist) {
nums <- vapply(charlist, as.numeric, numeric(1L))
sum(nums)
}
(Note the use of vapply instead of lapply: sum expects an atomic vector, we can’t pass a list.)
In your second function, you can capture ... and work with the captured variable:
examplefun2 <- function (...) {
nums <- as.numeric(c(...))
sum(nums)
}
For more complex arguments, Roland’s comment is a good alternative: Modify the function arguments as a list, and pass it to do.call.
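The do.call approach can be sketched as follows. The function names mirror the ones in the question, sum stands in for the imported function that takes its inputs through ..., and as.numeric stands in for the conversion from "X" to "Xs"; both stand-ins are assumptions made for illustration.

```r
# List in, separate arguments out: do.call spreads the list over `...`
examplefun <- function(charlist) {
  nums <- lapply(charlist, as.numeric)  # convert each element
  do.call(sum, nums)                    # same as sum(nums[[1]], nums[[2]], ...)
}

# Ellipsis in: capture it as a list, modify the elements, then pass them on
examplefun2 <- function(...) {
  args <- lapply(list(...), as.numeric)
  do.call(sum, args)
}

r1 <- examplefun(list("1", "2", "3.5"))
r2 <- examplefun2("1", "2", "3.5")
```

Both calls give 6.5 here. Unlike the vapply version, this also works when the converted objects are not single numbers, since each one is passed as its own argument.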

get names of quoted list without evaluation

I've got a quoted list
quote(list(orders = .N,
total_quantity = sum(quantity)))
(that I eventually eval in the j part of a data.table)
What I would like is to extract the names of that list without having to evaluate the expression because outside of the correct environment evaluating the expression will produce an error.
The list doesn't have any names at that point. It's not even a list. It's a call to the list() function. That said, you can take that function call apart and extract the argument names. For example:
x <- quote(list(orders = .N,
total_quantity = sum(quantity)))
names(as.list(x))[-1]
# [1] "orders" "total_quantity"
Calling as.list() on the expression turns the function call into a (named) list without evaluating it.
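The same trick also gives access to the unevaluated argument expressions, not just their names. A short sketch, where the variable names args, arg_names and arg_code are made up for illustration:

```r
x <- quote(list(orders = .N,
                total_quantity = sum(quantity)))

# Drop the first element (the symbol `list`) to keep only the arguments
args <- as.list(x)[-1]

arg_names <- names(args)
# Turn each unevaluated argument back into its source text
arg_code <- vapply(args, deparse, character(1))

arg_names
arg_code
```

Nothing is evaluated along the way, so it does not matter that .N and quantity do not exist in the calling environment.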

Generating named lists in R

If I want to create a named list, where I have named literals, I can just do this:
list(foo=1,bar=2,baz=3)
If instead I want to make a list with arbitrary computation, I can use lapply, so for example:
lapply(list(1,2,3), function(x) x)
However, the list generated by lapply will always be a regular numbered list. Is there a way I can generate a list with names using a function like lapply?
My idea is something along the lines of:
lapply(list("foo","bar","baz"), function(key) {key=5})
==>
list(foo=5,bar=5,baz=5)
That way I don't have to have the keys and values as literals.
I do know that I could do this:
res = list()
for(key in list("foo","bar","baz")) {
res[key] <- 5;
}
But I don't like how I have to create an empty list and mutate it to fill it out.
Edit: I would also like to do some computation based on the key. Something like this:
lapply(c("foo","bar","baz"), function(key) {paste("hello",key)=5})
sapply will use its argument for names if it is a character vector, so you can try:
sapply(c("foo","bar","baz"), function(key) 5, simplify = FALSE)
Which produces:
$foo
[1] 5
$bar
[1] 5
$baz
[1] 5
If your list has names in the first place, lapply will preserve them:
lapply(list(a=1,b=2,c=3), function(x) x)
Or you can set names before or after with setNames():
#before
lapply(setNames(list(1,2,3),c("foo","bar","baz")), function(x) x)
#after
setNames(lapply(list(1,2,3), function(x) x), c("foo","bar","baz"))
One other "option" is Map(). Map will try to take the names from the first parameter you pass in. You can ignore the value in the function and use it only for the side-effect of keeping the name
Map(function(a,b) 5, c("foo","bar","baz"), list(1:3))
But names cannot be changed during the lapply/Map steps. They can only be copied from another location. If you need to mutate names, you'll have to do that as a separate step.
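For the case in the question's edit, where the names are computed from the keys, that separate step could look like this. It is a minimal sketch; the variable names keys, values and res are made up for illustration.

```r
keys <- c("foo", "bar", "baz")

# Step 1: compute the values (here just the constant 5 for every key)
values <- lapply(keys, function(key) 5)

# Step 2: compute the names from the keys and attach them
res <- setNames(values, paste("hello", keys))

names(res)
```

The resulting names are "hello foo", "hello bar" and "hello baz".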

Different behavior of get and mget in aggregation (R)

I have a character vector (chr [1:5], named keynn) of column names on which I would like to perform an aggregation.
Every element of the vector is a valid column name of the data frame (mydata), but it is a string and not the variable itself ("YEAR" instead of mydata$YEAR).
I tried using get() to return the column from its name, and it works for the first element, like so:
attach(mydata)
aggregate(mydata, by=list(get(keynn, .GlobalEnv)), FUN=length)
I then tried mget(), since my vector has more than one element, like this:
attach(mydata)
aggregate(mydata, by=list(mget(keynn, .GlobalEnv)), FUN=length)
but I get an error:
value for 'YEAR' not found.
How can I get the equivalent of get for multiple columns to aggregate by?
Thank you!
I would suggest not using attach in general.
If you are just trying to get columns from mydata, you can use [ to index the data frame (which is a list of columns):
aggregate(mydata, by = mydata[keynn], FUN = length)
This should work, and it makes very clear that you want to get the keynn columns from mydata.
The problem with using attach is that it adds mydata to the search path (it does not copy the columns into the global environment).
Try
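A small self-contained sketch of the [ approach follows. The data frame and its column names are invented for illustration, and only the value column is aggregated so that the result is easier to read.

```r
# Hypothetical data standing in for mydata
mydata <- data.frame(
  YEAR   = c(2001, 2001, 2002, 2002),
  REGION = c("a", "b", "a", "a"),
  value  = 1:4
)
keynn <- c("YEAR", "REGION")

# mydata[keynn] is a list of the grouping columns, which is what `by` expects
res <- aggregate(mydata["value"], by = mydata[keynn], FUN = length)
res
```

The result has one row per YEAR/REGION combination that occurs in the data, with the group size in the value column.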
attach(mydata)
mget(keynn, .GlobalEnv)
So if you were to use mget with attach, you would need
mget(keynn, .GlobalEnv, inherits = TRUE)
so that it does not search only in the global environment.
But that is more effort than it is worth (IMHO).
The reason get works is that its inherits argument is TRUE by default. You could thus use lapply(keynn, get) if mydata were attached, but again this is ugly and unclear about what it is doing.
Another approach would be to use data.table, which evaluates the by argument within the data.table in question:
library(data.table)
DT <- data.table(mydata)
DT[, {what you want to aggregate} , by =keynn]
Note that keynn doesn't need to be a character vector of names; it can be a list of names, a named list of functions of names, etc.
