If I want to create a named list, where I have named literals, I can just do this:
list(foo=1,bar=2,baz=3)
If instead I want to make a list with arbitrary computation, I can use lapply, so for example:
lapply(list(1,2,3), function(x) x)
However, the list generated by lapply will always be a regular numbered list. Is there a way I can generate a named list using a function like lapply?
My idea is something along the lines of:
lapply(list("foo","bar","baz"), function(key) {key=5})
==>
list(foo=5,bar=5,baz=5)
That way I don't have to have the keys and values as literals.
I do know that I could do this:
res = list()
for(key in list("foo","bar","baz")) {
res[key] <- 5;
}
But I don't like how I have to create an empty list and mutate it to fill it out.
Edit: I would also like to do some computation based on the key. Something like this:
lapply(c("foo","bar","baz"), function(key) {paste("hello",key)=5})
sapply will use its argument for names if it is a character vector, so you can try:
sapply(c("foo","bar","baz"), function(key) 5, simplify=FALSE)
Which produces:
$foo
[1] 5
$bar
[1] 5
$baz
[1] 5
If your list has names in the first place, lapply will preserve them
lapply(list(a=1,b=2,c=3), function(x) x)
or you can set names before or after with setNames()
#before
lapply(setNames(list(1,2,3),c("foo","bar","baz")), function(x) x)
#after
setNames(lapply(list(1,2,3), function(x) x), c("foo","bar","baz"))
One other "option" is Map(). Map will try to take the names from the first parameter you pass in. You can ignore the value in the function and use it only for the side-effect of keeping the name
Map(function(a,b) 5, c("foo","bar","baz"), list(1:3))
But names cannot be changed during lapply/Map steps. They can only be copied from another location. If you need to mutate names, you'll have to do that as a separate step.
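For the case in the question's edit, where the name is computed from the key, a minimal sketch of that separate step could look like this. The variable names keys, values and res are only illustrative, and the constant 5 stands in for whatever computation you need:

```r
# Keys to iterate over (taken from the question)
keys <- c("foo", "bar", "baz")

# Step 1: compute the values; lapply knows nothing about names here
values <- lapply(keys, function(key) 5)

# Step 2: derive the names from the keys and attach them afterwards
res <- setNames(values, paste("hello", keys))
```

Here res is a list whose names are "hello foo", "hello bar" and "hello baz".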
I already have a list containing all functions in dplyr, built with this code:
content <- mget(ls("package:dplyr"), inherits = TRUE)
dplyr_functions <- Filter(is.function, content)
The result I wanna get is just like the result of
names(dplyr_functions)
It will be a chr vector containing all function names in the dplyr package.
But when I use map() function, my code is like:
dplyr_name <- map_chr(dplyr_functions, names)
There is an error that says:
"Result 1 must be a single string, not NULL of length 0"
So I just want to know what the error means. How can I use map_chr to get a vector containing all names in dplyr_functions?
map loops over each element's value, i.e. dplyr_functions[[1]] and so on, not over the one-element sublist dplyr_functions[1]; try both to see the difference. Hence names(dplyr_functions[[1]]) returns NULL and map_chr fails, while names(dplyr_functions[1]) returns "%>%", which map_chr can work with.
So we can loop over the list indices and subset using the second method, or use imap, which is designed to pass each element's name along with its value.
library(purrr)
map_chr(seq_along(dplyr_functions), ~names(dplyr_functions[.x]))
#or
imap_chr(dplyr_functions, ~.y) %>% unname()
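To see the [[ ]] versus [ ] difference without loading dplyr, here is a small base R sketch. The list fns is a made-up stand-in for dplyr_functions, and vapply plays the role of map_chr:

```r
# A toy named list of functions (hypothetical stand-in for dplyr_functions)
fns <- list(add = function(x, y) x + y, neg = function(x) -x)

# [[ ]] extracts the function itself, which carries no names
value_names <- names(fns[[1]])

# [ ] keeps a one-element list, so the name survives
sublist_names <- names(fns[1])

# Looping over indices and subsetting with [ ] recovers every name
nm <- vapply(seq_along(fns), function(i) names(fns[i]), character(1))
```

value_names is NULL, which is exactly what map_chr complains about, while sublist_names and nm hold the names.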
I have a problem with an ellipsis use case. My function accepts a list of objects, let's call them objects of class "X". These X objects are processed inside my function into class "Xs", so I end up with a list of "Xs" objects. A function that I import from another package can compute multiple "Xs" objects at once, but they have to be enumerated (the ellipsis mechanism), not passed as a list. Is there a way to solve this? I want something like this:
examplefun <- function(charlist){
nums <- lapply(charlist, as.numeric)
sum(... = nums)
}
Of course the example above throws an error, but it shows what I want to achieve. I tried unlist with recursive = FALSE ("X" and "Xs" are lists themselves) but it does not work.
If there is no solution then:
Let's assume I decided to accept ... instead of a list of "X" objects. Can I modify the ellipsis elements (change them to "Xs") and then pass them to a function that accepts an ellipsis? So it would look like this:
examplefun2 <- function(...){
# function that modifies the objects in ... to "Xs" objects
sum(...)
}
In your first function, just call sum directly because sum works correctly on vectors of numbers instead of individual numbers.
examplefun <- function (charlist) {
nums <- vapply(charlist, as.numeric, numeric(1L))
sum(nums)
}
(Note the use of vapply instead of lapply: sum expects an atomic vector, we can’t pass a list.)
In your second function, you can capture ... and work with the captured variable:
examplefun2 <- function (...) {
nums <- as.numeric(c(...))
sum(nums)
}
For more complex arguments, Roland’s comment is a good alternative: Modify the function arguments as a list, and pass it to do.call.
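A minimal sketch of the do.call approach, assuming the imported function takes its inputs through ... the way sum does (sum is only a placeholder for that function here):

```r
examplefun <- function(charlist) {
  # Convert each element, keeping the result as a list
  nums <- lapply(charlist, as.numeric)
  # do.call spreads the list elements out as separate arguments,
  # so this is equivalent to sum(nums[[1]], nums[[2]], ...)
  do.call(sum, nums)
}

total <- examplefun(list("1", "2", "3.5"))
```

The same pattern works inside a function that accepts ...: capture the arguments with list(...), modify the list, then hand it to do.call.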
I have a function that takes two arguments:
my.function <- function(name, value) {
print(name)
print(value) #using print as example
}
I have an integer vector that has names and values:
freq.chars <- table(sample(LETTERS[1:5], 10, replace=TRUE))
I'd like to use lapply to apply my.function to freq.chars where the name of each item is passed in as name, and the value (in this case frequency) is passed in as value.
When I try,
lapply(names(freq.chars), my.function)
I get an error that "value" is missing with no default.
I've also tried
lapply(names(freq.chars), my.function, name = names(freq.chars), value = freq.chars)
, in which case I get an error: unused argument value = c(...).
Sorry for the edits and clarity, I'm new at this...
We use this test data:
set.seed(123) # needed for reproducibility
char.vector <- sample(LETTERS[1:5], 10, replace=TRUE)
freq.chars <- table(char.vector)
Here are several variations:
# 1. iterate simultaneously over names and values
mapply(my.function, names(freq.chars), unname(freq.chars))
# 2. same code except Map replaces mapply. Map returns a list.
Map(my.function, names(freq.chars), unname(freq.chars))
# 3. iterate over index and then turn index into names and values
sapply(seq_along(freq.chars),
function(i) my.function(names(freq.chars)[i], unname(freq.chars)[i]))
# 4. same code as last one except lapply replaces sapply. Returns list.
lapply(seq_along(freq.chars),
function(i) my.function(names(freq.chars)[i], unname(freq.chars)[i]))
# 5. this iterates over names rather than over an index
sapply(names(freq.chars), function(nm) my.function(nm, freq.chars[[nm]]))
# 6. same code as last one except lapply replaces sapply. Returns list.
lapply(names(freq.chars), function(nm) my.function(nm, freq.chars[[nm]]))
Note that mapply and sapply have an optional USE.NAMES argument that controls whether names are inferred for the result and an optional simplify argument (SIMPLIFY for mapply) which controls whether list output is simplified. Use these arguments for further control.
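As a rough illustration of those two arguments, here is a sketch with a small hand-made named vector (freq and f are hypothetical stand-ins for freq.chars and my.function):

```r
# Hypothetical named counts and a two-argument function
freq <- c(A = 3L, B = 7L)
f <- function(name, value) paste(name, value)

# Defaults: result is simplified to a vector, names taken from the first argument
res_default <- mapply(f, names(freq), freq)

# No simplification and no names: a plain unnamed list
res_plain <- mapply(f, names(freq), freq, SIMPLIFY = FALSE, USE.NAMES = FALSE)
```

res_default is a character vector named "A" and "B", while res_plain is an unnamed list with the same two strings.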
Update Completely revised presentation.
If you just want to add another parameter to your function, specify it after the function name (3rd parm in lapply).
lapply(names(freq.chars), my.function, char.vector)
I have a character array (chr [1:5] named keynn) of column names on which I would like to perform an aggregation.
All elements of the array are valid column names of the data frame (mydata), but each is a string and not the variable ("YEAR" instead of mydata$YEAR).
I tried using get() to return the column from the name and it works, for the first element, like so:
attach(mydata)
aggregate(mydata, by=list(get(keynn, .GlobalEnv)), FUN=length)
I tried using mget() since my array has more than one element, like this:
attach(mydata)
aggregate(mydata, by=list(mget(keynn, .GlobalEnv)), FUN=length)
but I get an error:
value for 'YEAR' not found.
How can I get the equivalent of get for multiple columns to aggregate by?
Thank you!
I would suggest not using attach in general.
If you are just trying to get columns from mydata you can use [ to index the list
aggregate(mydata, by = mydata[keynn], FUN = length)
should work -- and is very clear that you want to get keynn from mydata
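Here is a small sketch of that indexing approach on made-up data. The columns and values are invented for illustration, and only the SALES column is aggregated to keep the output small:

```r
# Hypothetical data; only the shape matters
mydata <- data.frame(
  YEAR   = c(2001, 2001, 2002),
  REGION = c("N", "N", "S"),
  SALES  = c(10, 20, 30)
)
keynn <- c("YEAR", "REGION")

# mydata[keynn] is a data frame (hence a list) of the grouping columns
agg <- aggregate(mydata["SALES"], by = mydata[keynn], FUN = length)
```

agg has one row per YEAR/REGION combination, with SALES holding the number of rows in each group.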
The problem with using attach is that it adds mydata to the search path (not copying to the global environment)
try
attach(mydata)
mget(keynn, .GlobalEnv)
so if you were to use mget and attach, you need
mget(keynn, .GlobalEnv, inherits = TRUE)
so that it will not just search in the global environment.
But that is more effort than it is worth (IMHO)
The reason get works is that inherits = TRUE by default. You could thus use lapply(keynn, get) if mydata were attached, but again this is ugly and unclear about what it is doing.
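The effect of inherits can be reproduced without attach, using two throwaway environments. In this sketch, outer and inner are hypothetical: outer plays the role of the attached data and inner the role of the environment you search from:

```r
# outer holds the variable; inner is a child environment that does not
outer <- new.env()
assign("YEAR", 2001, envir = outer)
inner <- new.env(parent = outer)

# Without inherits, only inner itself is searched
found_locally <- exists("YEAR", envir = inner, inherits = FALSE)

# With inherits = TRUE, the search continues into enclosing environments
vals <- mget("YEAR", envir = inner, inherits = TRUE)
```

found_locally is FALSE, while vals$YEAR is found because the lookup walks up to outer.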
Another approach would be to use data.table, which will evaluate the by argument within the data.table in question:
library(data.table)
DT <- data.table(mydata)
DT[, {what you want to aggregate} , by =keynn]
Note that keynn doesn't need to be a character vector of names, it can be a list of names or a named list of functions of names etc
Is there any way I can loop through some set of objects and apply a function to each?
When I type ls() or objects(), it returns a list of object names. I would like to iterate through this list, identify those which are data.frames, and then run a function against each object.
How do I pass an entry from ls or objects through a function?
The answer given by @jverzani about figuring out which objects are data frames is good. So let's start with that. But we want to select only the items that are data.frames. So we could do that this way:
#test data
df <- data.frame(a=1:10, b=11:20)
df2 <- data.frame(a=2:4, b=4:6)
notDf <- 1
dfs <- ls()[sapply(mget(ls(), .GlobalEnv), is.data.frame)]
the names of the data frames are now strings in the dfs object so you can pass them to other functions like so:
sapply( dfs, function(x) str( get( x ) ) )
I used the get() command to actually get the object by name (see the R FAQ for more about that)
I've answered your question above, but I have a suspicion that if you would organize your data frames into list items your code would be MUCH more readable and easy to maintain. Obviously I can't say this with certainty, but I can't come up with a use case where iterating through all objects looking for the data frames is superior to keeping your data frames in a list and then calling each item in that list.
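To make the list suggestion concrete, here is a sketch using the same test data, stored in a list called frames (a name I made up) instead of as separate objects:

```r
# Keep the data frames together in one named list
frames <- list(
  df  = data.frame(a = 1:10, b = 11:20),
  df2 = data.frame(a = 2:4, b = 4:6)
)

# Apply a function to every data frame; no ls(), get() or filtering needed
row_counts <- vapply(frames, nrow, integer(1))
```

row_counts is a named integer vector with one entry per data frame.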
You can get an object from its name with get or mget and iterate with one of the apply type functions. For example,
sapply(mget(ls(), .GlobalEnv), is.data.frame)
will tell you which items in the global environment are data frames. To use within a function, you can specify an environment to the ls call.
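A sketch of the within-a-function case, assuming you want the data frames defined in the function's own frame. The function name local_data_frames and its local variables are hypothetical:

```r
local_data_frames <- function() {
  local_df <- data.frame(a = 1:3)
  local_num <- 42
  # environment() returns the frame of the currently running function
  env <- environment()
  # List and fetch only the objects that live in that frame
  objs <- mget(ls(env), envir = env)
  names(Filter(is.data.frame, objs))
}

found <- local_data_frames()
```

found contains only "local_df", since the other local objects are not data frames.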
You can loop through objects in environment using "eapply".
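A short sketch of eapply on a scratch environment (e and its contents are made up for the example; pass globalenv() to scan your workspace instead):

```r
# Scratch environment with one data frame and one non-data-frame
e <- new.env()
assign("df", data.frame(a = 1:3), envir = e)
assign("x", 1, envir = e)

# eapply calls the function on every object in the environment
# and returns a named list (in no particular order)
is_df <- unlist(eapply(e, is.data.frame))
df_names <- names(is_df)[is_df]
```

df_names holds the names of the objects in e that are data frames.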
Throwing in another solution to the mix using inherits. It basically (a) gets all objects from the current environment and (b) checks if they inherit from a data frame.
sapply(sapply(ls(), get), inherits, 'data.frame')
You can use the function get() to refer to an object by name
# Create some objects
df <- data.frame(a=1:10)
dl <- list(a=1, b=2, c=3)
# Use `ls()` to return a list of object names
lso <- ls()
# Use `get()` to refer to specific objects
class(get(lso[1]))
[1] "data.frame"
# Using an apply function to evaluate the class
lapply(lso, function(x) class(get(x)))
[[1]]
[1] "data.frame"
[[2]]
[1] "list"
You can use Filter with is.data.frame on the result of mget(ls()) to get a named list of, in this case, data.frame objects.
This list can then be used, e.g. in lapply, to apply a function to each element of the list.
L <- Filter(is.data.frame, mget(ls()))
lapply(L, nrow)