Is there any way I can loop through some set of objects and apply a function to each?
When I type ls() or objects(), it returns a list of object names. I would like to iterate through this list, identify those which are data.frames, and then run a function against each object.
How do I pass an entry from ls or objects through a function?
The answer given by @jverzani about figuring out which objects are data frames is good, so let's start with that. But we want to select only the items that are data.frames. We could do that this way:
#test data
df <- data.frame(a=1:10, b=11:20)
df2 <- data.frame(a=2:4, b=4:6)
notDf <- 1
dfs <- ls()[sapply(mget(ls(), .GlobalEnv), is.data.frame)]
The names of the data frames are now strings in the dfs object, so you can pass them to other functions like so:
sapply( dfs, function(x) str( get( x ) ) )
I used the get() command to actually get the object by name (see the R FAQ for more about that).
I've answered your question above, but I have a suspicion that if you organized your data frames into list items, your code would be MUCH more readable and easier to maintain. Obviously I can't say this with certainty, but I can't come up with a use case where iterating through all objects looking for the data frames is superior to keeping your data frames in a list and then calling each item in that list.
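As a minimal sketch of the list-based organisation suggested above (the list name frames and its element names are made up for illustration):

```r
# Hypothetical example data: keep related data frames together in one named list
frames <- list(
  first  = data.frame(a = 1:10, b = 11:20),
  second = data.frame(a = 2:4,  b = 4:6)
)

# Apply a function to every data frame; the list names carry through
row_counts <- sapply(frames, nrow)
row_counts

# Individual frames stay reachable by name
str(frames[["second"]])
```

With this layout there is no need to search the workspace with ls() and get() at all.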
You can get an object from its name with get or mget and iterate with one of the apply type functions. For example,
sapply(mget(ls(), .GlobalEnv), is.data.frame)
will tell you which items in the global environment are data frames. To use within a function, you can specify an environment to the ls call.
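A rough sketch of the "within a function" case, assuming a hypothetical helper which_data_frames() that takes the environment to inspect as an argument:

```r
# Hypothetical helper: report which objects in `env` are data frames.
# By default `env` is the environment the helper was called from.
which_data_frames <- function(env = parent.frame()) {
  force(env)                      # resolve the caller's environment up front
  nms <- ls(envir = env)
  vapply(mget(nms, envir = env), is.data.frame, logical(1))
}

demo <- function() {
  local_df  <- data.frame(a = 1:3)
  local_num <- 42
  which_data_frames()             # inspects demo()'s own local variables
}

res <- demo()
res
```

Passing the same environment to both ls() and mget() keeps the names and the lookups in step.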
You can loop through the objects in an environment using eapply.
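One possible way to use eapply, shown here on a scratch environment rather than the real workspace (the environment env and its contents are assumptions for the demo):

```r
# A scratch environment standing in for the workspace (an assumption for the demo)
env <- new.env()
assign("df",    data.frame(a = 1:10, b = 11:20), envir = env)
assign("df2",   data.frame(a = 2:4,  b = 4:6),   envir = env)
assign("notDf", 1,                               envir = env)

# eapply() calls the function on every object in the environment
# and returns a named list (in no particular order)
is_df <- unlist(eapply(env, is.data.frame))

# Keep only the data frames and apply a function to each
df_names   <- sort(names(is_df)[is_df])
row_counts <- sapply(mget(df_names, envir = env), nrow)
row_counts
```

To scan the workspace itself, globalenv() could be passed in place of env.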
Throwing in another solution to the mix using inherits. It basically (a) gets all objects from the current environment and (b) checks if they inherit from a data frame.
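# simplify = FALSE keeps the inner result a list even when the objects happen to have equal lengths
sapply(sapply(ls(), get, simplify = FALSE), inherits, 'data.frame')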
You can use the function get() to refer to an object by name:
# Create some objects
df <- data.frame(a=1:10)
dl <- list(a=1, b=2, c=3)
# Use `ls()` to return a character vector of object names
lso <- ls()
# Use `get()` to refer to specific objects
class(get(lso[1]))
[1] "data.frame"
# Using an apply function to evaluate the class
lapply(lso, function(x) class(get(x)))
[[1]]
[1] "data.frame"
[[2]]
[1] "list"
You can use Filter with is.data.frame on the result of mget(ls()) to get a named list of, in this case, the data.frame objects.
This list can then be used, e.g. in lapply, to apply a function to each element of the list.
L <- Filter(is.data.frame, mget(ls()))
lapply(L, nrow)
So we know that R has a list() variable type, and also that R has a function called names() to give names to a variable. For example:
a=30
names(a)="number"
a
# number
# 30
But now I want to give a list variable a name, like this:
b=list()
names(b)="number"
and it returns an error message like this:
Error in names(b) = "number" :
'names' attribute [1] must be the same length as the vector [0]
What am I supposed to do? I am doing this because I need many list variables. Or do you have another way to make many list variables without playing with their names?
Since @akrun doesn't need any more points, here is an example showing how you can assign names to a list:
lst <- list(a="one", b="two", c=c(1:3))
names(lst)
[1] "a" "b" "c"
names(lst) <- c("x", "y", "z")
> lst
$x
[1] "one"
$y
[1] "two"
$z
[1] 1 2 3
It seems as though you are interested in labeling the object itself rather than the elements in it. The key is that the names attribute for a list object is necessarily assigned to its elements. One option, since you said you have many list objects, is to store the lists in a big list and then you can assign names to the big list, and the elements within the list-objects can be named too.
allLists <- list('number' = list())
> allLists
$number
list()
Another option is to make use of the label feature in the Hmisc package. It modifies most common objects in R to have a subclass "labelled", so whenever you print the list it shows the label. It is good for documentation and for organizing the workspace a bit better, but one caveat: it's very easy to accidentally cast labelled objects to a non-labelled class and to confuse methods that don't think to search for more than one class.
library(Hmisc)
p <- list()
label(p) <- 'number'
> p
number
list()
Another option is to make the "name" of your list object an actual element of the list. You'll see in a lot of complex R data structures, this is the preferred way of storing labels, titles, or names when such a need arises and isn't met by the base R data structure.
b <- list('name' = 'number')
The last possibility is that you need a placeholder to store the "names" attribute of the elements you haven't yet populated the list with. If the elements are of known length and of known type, you can allocate such a vector using e.g. numeric(1), a sort of "primed" vector which can be named. If you don't know the data structure of your output, I would not use this approach, since it can be a real memory hog to "build" data structures in R.
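A small sketch of what such a placeholder could look like. The numeric(1) part follows the description above; the list variant with vector("list", n) and the names in keys are additions for illustration:

```r
# Pre-allocate one numeric slot so a name can be attached before the value exists
b <- numeric(1)
names(b) <- "number"
b["number"] <- 30            # fill the slot in later

# The same idea for a list: vector("list", n) creates n empty (NULL) slots
keys <- c("first", "second") # hypothetical element names
lst  <- setNames(vector("list", length(keys)), keys)
lst[["first"]] <- 1:3        # "second" stays NULL until it is filled
```

Because the length is fixed up front, names() no longer fails with the length-mismatch error that an empty list() produces.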
Other possibilities are
as.list(a)
# $`number`
# [1] 30
# or
setNames(list(unname(a)),'number')
# $`number`
# [1] 30
# or named list with named vector
setNames(list(a), 'number')
# $`number`
# number
# 30
If I want to create a named list, where I have named literals, I can just do this:
list(foo=1,bar=2,baz=3)
If instead I want to make a list with arbitrary computation, I can use lapply, so for example:
lapply(list(1,2,3), function(x) x)
However, the list generated by lapply will always be a regular numbered list. Is there a way I can generate a list with names using a function like lapply?
My idea is something along the lines of:
lapply(list("foo","bar","baz"), function(key) {key=5})
==>
list(foo=5,bar=5,baz=5)
That way I don't have to have the keys and values as literals.
I do know that I could do this:
res = list()
for(key in list("foo","bar","baz")) {
res[key] <- 5;
}
But I don't like how I have to create an empty list and mutate it to fill it out.
Edit: I would also like to do some computation based on the key. Something like this:
lapply(c("foo","bar","baz"), function(key) {paste("hello",key)=5})
sapply will use its argument for names if it is a character vector, so you can try:
sapply(c("foo","bar","baz"), function(key) 5, simplify=FALSE)
Which produces:
$foo
[1] 5
$bar
[1] 5
$baz
[1] 5
If your list has names in the first place, lapply will preserve them
lapply(list(a=1,b=2,c=3), function(x) x)
or you can set names before or after with setNames()
#before
lapply(setNames(list(1,2,3),c("foo","bar","baz")), function(x) x)
#after
setNames(lapply(list(1,2,3), function(x) x), c("foo","bar","baz"))
One other "option" is Map(). Map will try to take the names from the first parameter you pass in. You can ignore the value in the function and use it only for the side-effect of keeping the name
Map(function(a,b) 5, c("foo","bar","baz"), list(1:3))
But names cannot be changed during lapply/Map steps. They can only be copied from another location. If you need to mutate names, you'll have to do that as a separate step.
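For the case in the question's edit, where the names are computed from the keys, the separate renaming step might look like the sketch below (the value computation nchar(key) + 2 is invented purely so there is something to compute):

```r
keys <- c("foo", "bar", "baz")

# Step 1: compute the values (a made-up computation on each key)
values <- lapply(keys, function(key) nchar(key) + 2)

# Step 2: compute the names from the same keys and attach them
res <- setNames(values, paste("hello", keys))
res[["hello foo"]]
```

No empty list is created and mutated; the names are attached once, after the values exist.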
I have variable a as string:
a = "jul_0_baseline,jul_1_baseline,...jul_11_baseline,jul_12_baseline"
When I try to merge the following zoo series to one table using:
temp <- merge(jul_0_baseline,jul_1_baseline,...jul_11_baseline,jul_12_baseline)
it works, however when I try to merge it using
temp <- merge(a)
I get an error, as the variable a is a string (even though the text is correct). I am assuming that it is effectively inputting
temp <- merge("jul_0_baseline,jul_1_baseline,...jul_11_baseline,jul_12_baseline")
Any help would be greatly appreciated
a is a string because it is created using the code:
a <- paste("jul","0","baseline",sep = "_")
for (d in 1:12){ b <- paste("jul",d,"baseline",sep = "_")
a <- paste(a,b, sep=",")
}
Form the entire command string (including merge) and then parse and evaluate it:
eval(parse(text = sprintf("merge(%s)", a)))
Since each jul_i_baseline listed in the string a corresponds to an actual object, you can do this:
temp <- Reduce(function(...) merge(..., all=TRUE), mget(strsplit(a, ",")[[1]]))
The strsplit() function splits a into a vector of strings where each element is "jul_i_baseline". It returns a one-element list, so we use [[1]] to get the vector of strings.
mget() looks up the objects named by those strings. It returns a list where each element is the corresponding object, so each element is the actual zoo object jul_i_baseline.
Reduce(function(...) merge(..., all=TRUE), <list>) merges the objects stored in the list, two at a time. Assuming the objects have a common variable on which you want to merge, you can also add a by variable in merge().
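The same three steps can be tried on plain data frames. This is only a sketch: the three small data frames, the shared id column, and the scratch environment env are stand-ins invented for the demo, not the zoo series from the question.

```r
# Hypothetical stand-ins for the jul_*_baseline objects, sharing an `id` column
env <- new.env()
assign("jul_0_baseline", data.frame(id = 1:3, v0 = c(10, 20, 30)), envir = env)
assign("jul_1_baseline", data.frame(id = 2:4, v1 = c(1, 2, 3)),    envir = env)
assign("jul_2_baseline", data.frame(id = 1:4, v2 = c(5, 6, 7, 8)), envir = env)

a <- "jul_0_baseline,jul_1_baseline,jul_2_baseline"

nms  <- strsplit(a, ",")[[1]]    # character vector of object names
objs <- mget(nms, envir = env)   # named list of the objects themselves

# Merge pairwise from left to right, keeping unmatched rows (all = TRUE)
temp <- Reduce(function(x, y) merge(x, y, by = "id", all = TRUE), objs)
temp
```

Rows that are missing from one of the inputs come back with NA in that input's columns.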
An alternate approach, as suggested in the comments, is to use do.call(), which would work since you're dealing with zoo objects. (The former approach works with non-zoo objects as well, but this does not.) The command would be structured like so:
temp <- do.call(merge, mget(strsplit(a, ",")[[1]]))
Again we're getting the objects using mget() and strsplit().
@G.Grothendiek's suggestion of using eval(parse(...)) also works in this situation. However, many R users discourage the use of eval(parse(...)) in general. See here.
I have a character vector (chr [1:5] named keynn) of column names on which I would like to perform an aggregation.
Every element of the vector is a valid column name of the data frame (mydata), but it is a string and not the variable ("YEAR" instead of mydata$YEAR).
I tried using get() to return the column from the name, and it works for the first element, like so:
attach(mydata)
aggregate(mydata, by=list(get(keynn, .GlobalEnv)), FUN=length)
I tried using mget(), since my vector has more than one element, like this:
attach(mydata)
aggregate(mydata, by=list(mget(keynn, .GlobalEnv)), FUN=length)
but I get an error:
value for 'YEAR' not found.
How can I get the equivalent of get for multiple columns to aggregate by?
Thank you!
I would suggest not using attach in general.
If you are just trying to get columns from mydata, you can use [ to index the list:
aggregate(mydata, by = mydata[keynn], FUN = length)
should work -- and is very clear that you want to get keynn from mydata
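A quick way to check this on toy data. The data frame below and its column names are made up, and the sketch counts only the VALUE column rather than aggregating every column of mydata:

```r
# Hypothetical data; YEAR and REGION play the role of the key columns
mydata <- data.frame(
  YEAR   = c(2010, 2010, 2011, 2011, 2011),
  REGION = c("N", "S", "N", "N", "S"),
  VALUE  = c(1.5, 2.0, 3.5, 4.0, 5.5)
)
keynn <- c("YEAR", "REGION")

# mydata[keynn] is a data frame (hence a list) of the grouping columns,
# which is exactly what the `by` argument expects
counts <- aggregate(mydata["VALUE"], by = mydata[keynn], FUN = length)
counts
```

No attach(), get() or mget() is involved: the columns are looked up directly in mydata.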
The problem with using attach is that it adds mydata to the search path (not copying to the global environment)
try
attach(mydata)
mget(keynn, .GlobalEnv)
so if you were to use mget and attach, you need
mget(keynn, .GlobalEnv, inherits = TRUE)
so that it will not just search in the global environment.
But that is more effort than it is worth (IMHO)
The reason get works is that inherits = TRUE by default. You could thus use lapply(keynn, get) if mydata were attached, but again this is ugly and unclear about what it is doing.
Another approach would be to use data.table, which will evaluate the by argument within the data.table in question:
library(data.table)
DT <- data.table(mydata)
DT[, {what you want to aggregate} , by =keynn]
Note that keynn doesn't need to be a character vector of names, it can be a list of names or a named list of functions of names etc
I would like to be able to subset the list of objects in my Global Environment by class.
i.e. from the list created by running
ls()
I would like to be able to make a shorter list that only has the names of the objects that belong to a specific class, e.g. xts or POSIXlt.
Thanks in advance
This is a slight twist to the above which uses inherits to inspect the object:
objs = mget(ls(envir=.GlobalEnv), envir=.GlobalEnv)
names(Filter(function(i) inherits(i, "lm"), objs))
The function(i) inherits(i, "lm") can be adjusted as you want.
You could retrieve ls() and check the class of everything. It may not be particularly efficient though, as it does the filtering after ls() and not within.
# populate global environment with some vars.
rm(list=ls())
a <- 1
b <- 2
c <- 'foo'
d <- 'asdf'
lst <- ls()
# return everything 'numeric':
lst[sapply(lst,function(var) any(class(get(var))=='numeric'))]
# 'a' 'b'
The get(var) gets the variable corresponding to the string in var, so if var is "a" then get(var) retrieves 1 (being the value of variable a).
As @VincentZoonekynd notes below, it is possible for objects to have multiple classes. So if class(some_xts_object) is c("xts","zoo"), the above method will return some_xts_object if you search for xts objects, but also if you search for zoo objects.
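To illustrate the multiple-class point, here is a sketch with a hypothetical helper ls_by_class() built on inherits. The object series is only a stand-in carrying an xts-like class vector, not a real xts object, so the example runs without the xts package:

```r
# Hypothetical helper: names of objects in `env` that inherit from class `cls`
ls_by_class <- function(cls, env = globalenv()) {
  nms  <- ls(envir = env)
  keep <- vapply(nms, function(nm) inherits(get(nm, envir = env), cls), logical(1))
  nms[keep]
}

# Demo in a scratch environment (an assumption, to leave the workspace alone)
env <- new.env()
assign("series", structure(list(), class = c("xts", "zoo")), envir = env)  # stand-in only
assign("when",   as.POSIXlt("2020-01-01", tz = "UTC"),       envir = env)
assign("x",      1:3,                                        envir = env)

ls_by_class("xts", env)      # matches `series`
ls_by_class("zoo", env)      # also matches `series`
ls_by_class("POSIXlt", env)  # matches `when`
```

If only the first (most specific) class should count, the test inside the helper could compare class(obj)[1] with cls instead of calling inherits.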