subset() drops attributes on vectors; how to maintain/persist them?

Let's say I have a vector where I've set a few attributes:
vec <- sample(50:100,1000, replace=TRUE)
attr(vec, "someattr") <- "Hello World"
When I subset the vector, the attributes are dropped. For example:
tmp.vec <- vec[which(vec > 80)]
attributes(tmp.vec) # Now NULL
Is there a way to subset and persist the attributes without having to save them to another temporary object?
Bonus: Where would one find documentation of this behaviour?

I would write a method for [ or subset() (depending on how you are subsetting) and arrange for it to preserve the attributes. For dispatch to occur, the vector also needs a "class" attribute.
vec <- 1:10
attr(vec, "someattr") <- "Hello World"
class(vec) <- "foo"
At this point, subsetting removes attributes:
> vec[1:5]
[1] 1 2 3 4 5
If we add a method [.foo we can preserve the attributes:
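`[.foo` <- function(x, i, ...) {
  # remember every attribute of the original object
  attrs <- attributes(x)
  # subset the bare vector so that this method is not called recursively
  out <- unclass(x)[i]
  # names (if any) were already subset by `[`; keep those, not the full-length originals
  attrs$names <- names(out)
  # restore the remaining attributes, including the class
  attributes(out) <- attrs
  out
}
Now the attributes are preserved: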
> vec[1:5]
[1] 1 2 3 4 5
attr(,"someattr")
[1] "Hello World"
attr(,"class")
[1] "foo"
And the answer to the bonus question:
From ?"[" in the details section:
Subsetting (except by an empty index) will drop all attributes except names, dim and dimnames.
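To see the documented behaviour on a plain vector, here is a minimal sketch. The vector v and its values are made up for illustration; only base R is used.

```r
# A named vector with one extra, user-defined attribute
v <- c(a = 1, b = 2, c = 3)
attr(v, "someattr") <- "Hello World"

# Ordinary subsetting keeps names but drops the custom attribute
sub <- v[2:3]
names(sub)
attr(sub, "someattr")

# Subsetting by an empty index leaves all attributes in place
attr(v[], "someattr")
```

So names survive because `[` treats them specially, while anything set with attr() has to be carried over by hand or by a method such as the one above.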

Thanks to a similar answer to my question by @G. Grothendieck, you can use collapse::fsubset.
library(collapse)
#tmp_vec <- fsubset(vec, vec > 80)
tmp_vec <- sbt(vec, vec > 80) # Shortcut for fsubset
attributes(tmp_vec)
# $someattr
# [1] "Hello World"

Related

How do I force my vector to split into arguments?

I'm having problems with applying a function to a vector of arguments. The point is, none of the arguments are vectors.
I'm trying to apply my function with the do.call command, and my attempts go like this:
do.call("bezmulti", list(dat$t, as.list(getvarnames(n, "a"))))
where bezmulti is a function that takes in a vector (dat$t) and an indefinite number of single numbers, which are provided by the function getvarnames in the form of a vector, which I need to split.
The problem is that this list doesn't work the way I want it to - the way I would want would be:
[[1]]
#vector goes here
[[2]]
#the
[[3]]
#numbers
[[4]]
#go
[[5]]
#here
However, my proposed solution and all my other attempts produce lists that are either nested somehow or have only two elements, both of which are vectors. Is there a way to force the list to be in the format above?
EDIT: Functions used in this post look as follows
bezmulti <- function(t,...) {
coeff <- list(...)
n <- length(coeff)-1
sumco <- rep(0, length(t))
for (i in c(0:n)) {
sumco=sumco+coeff[[i+1]]*choose(n, i)*(1-t)^(n-i)*t^i
}
return(sumco)
}
getvarnames <- function(n, charasd) {
vec=NULL
for (j in c(1:n)) {
vec <- append(vec, eval(str2expression(paste0(charasd, as.character(j)))))
}
return(vec)
}

I think what you need to do is this:
do.call("bezmulti", c(list(dat$t), as.list(getvarnames(n, "a"))))
For example:
dat= data.frame(t = c(1,2,3,4,6))
c(list(dat$t), as.list(c(8,10,12)))
Output:
[[1]]
[1] 1 2 3 4 6
[[2]]
[1] 8
[[3]]
[1] 10
[[4]]
[1] 12
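To check that the flat list really reaches the function as separate arguments, here is a small sketch. bezmulti is copied from the question; t_vals and coefs are made-up stand-ins for dat$t and the result of getvarnames(n, "a").

```r
# bezmulti() as given in the question
bezmulti <- function(t, ...) {
  coeff <- list(...)
  n <- length(coeff) - 1
  sumco <- rep(0, length(t))
  for (i in 0:n) {
    sumco <- sumco + coeff[[i + 1]] * choose(n, i) * (1 - t)^(n - i) * t^i
  }
  sumco
}

t_vals <- c(0, 0.5, 1)  # hypothetical stand-in for dat$t
coefs  <- c(1, 2, 3)    # hypothetical stand-in for getvarnames(n, "a")

# list(t_vals, as.list(coefs)) would have 2 elements (the second one nested);
# c() splices the two lists into one flat list of 4 arguments
args_flat <- c(list(t_vals), as.list(coefs))
length(args_flat)

res    <- do.call(bezmulti, args_flat)
direct <- bezmulti(t_vals, 1, 2, 3)  # the same call written out by hand
identical(res, direct)
```

The key point is that c() applied to two lists concatenates them, whereas list() wraps its arguments and so produces the nesting described in the question.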

Extracting coefficients while looping over variable names

I'm working on some time-series stuff in R (version 3.4.1), and would like to extract coefficients from regressions I ran, in order to do further analysis.
All results are so far saved as uGARCHfit objects, which are basically complicated list objects, from which I want to extract the coefficients in the following manner.
What I want is in essence this:
for(i in list){
i_GARCH_mxreg <- i_GARCH@fit$robust.matcoef[5,1]
}
"list" is a list object, where every element is the name of one observation. For now, I want my loop to create a new numeric object named as I specified in the loop.
Now this obviously doesn't work because the index, 'i', isn't replaced as I would want it to be.
How do I rewrite my loop appropriately?
Minimal working example:
list <- as.list(c("one", "two", "three"))
one_a <- 1
two_a <- 2
three_a <- 3
for (i in list){
i_b <- i_a
}
what this should give me would be:
> one_b
[1] 1
> two_b
[1] 2
> three_b
[1] 3
Clarification:
I want to extract the coefficients from multiple list objects. These are named in the manner 'string'_obj. The problem is that I don't have a function that would extract these coefficients, and the list "is not subsettable", so I have to call the individual objects via obj@fit$robust.matcoef[5,1] (or is there another way?). I wanted to use the loop to take my list of strings and, in every iteration, take one string, evaluate 'string'_obj@fit$robust.matcoef[5,1], and save this value into an object, named again with " 'string'_name ".
It might well be easier to put this into a list rather than individual objects, as someone suggested with lapply, but this is not my primary concern right now.
There is likely an easy way to do this, but I am unable to find it. Sorry for any confusion and thanks for any help.

The following should match your desired output:
# your list
l <- as.list(c("one", "two", "three"))
one_a <- 1
two_a <- 2
three_a <- 3
# my workspace: note that there is no one_b, two_b, three_b
ls()
[1] "l" "one_a" "three_a" "two_a"
for (i in l){
# first, let's define the names as characters, using paste:
dest <- paste0(i, "_b")
orig <- paste0(i, "_a")
# then let's assign the values. Since we are working with
# characters, the functions assign and get come in handy:
assign(dest, get(orig) )
}
# now let's check my workspace again. Note one_b, two_b, three_b
ls()
[1] "dest" "i" "l" "one_a" "one_b" "orig" "three_a"
[8] "three_b" "two_a" "two_b"
# let's check that the values are correct:
one_b
[1] 1
two_b
[1] 2
three_b
[1] 3
To comment on the functions used: assign takes a character as first argument, which is supposed to be the name of the newly created object. The second argument is the value of that object. get takes a character and looks up the value of the object in the workspace with the same name as that character. For instance, get("one_a") will yield 1.
Also, just to follow up on my comment earlier: If we already had all the coefficients in a list, we could do the following:
# hypothetical coefficients stored in list:
lcoefs <- list(1,2,3)
# let's name the coefficients:
lcoefs <- setNames(lcoefs, paste0(c("one", "two", "three"), "_c"))
# push them into the global environment:
list2env(lcoefs, envir = .GlobalEnv)
# look at environment:
ls()
[1] "dest" "i" "l" "lcoefs" "one_a" "one_b" "one_c"
[8] "orig" "three_a" "three_b" "three_c" "two_a" "two_b" "two_c"
one_c
[1] 1
two_c
[1] 2
three_c
[1] 3
And to address the comments, here is a slightly more realistic example, taking the list structure into account:
l <- as.list(c("one", "two", "three"))
# let's "hide" the values in a list:
one_a <- list(val = 1)
two_a <- list(val = 2)
three_a <- list(val = 3)
for (i in l){
dest <- paste0(i, "_b")
orig <- paste0(i, "_a")
# let's get the list-object:
tmp <- get(orig)
# extract value:
val <- tmp$val
assign(dest, val )
}
one_b
[1] 1
two_b
[1] 2
three_b
[1] 3
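If collecting the values in one named vector is acceptable instead of creating separate objects, the same lookup can be sketched without assign. This reuses the toy objects from the example above; keys and coefs are names invented for this illustration.

```r
# Toy objects, each "hiding" its value in a list element
one_a   <- list(val = 1)
two_a   <- list(val = 2)
three_a <- list(val = 3)

keys <- c("one", "two", "three")

# Look each object up by its constructed name and pull out the value;
# sapply() names the result after the elements of 'keys'
coefs <- sapply(keys, function(k) get(paste0(k, "_a"))$val)
coefs
coefs[["two"]]
```

For S4 objects such as the fitted models in the question, the extraction step inside the function would presumably use the slot accessor (for example @fit$robust.matcoef[5, 1]) instead of $val.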

R get objects' names from the list of objects

I am trying to get an object's name from the list containing this object. I searched through similar questions and found some suggestions about using the deparse(substitute(object)) idiom:
> my.list <- list(model.product, model.i, model.add)
> lapply(my.list, function(model) deparse(substitute(model)))
and the result is:
[[1]]
[1] "X[[1L]]"
[[2]]
[1] "X[[2L]]"
[[3]]
[1] "X[[3L]]"
whereas I want to obtain:
[1] "model.product", "model.i", "model.add"
Thank you in advance for being of some help!

You can write your own list() function so it behaves like data.frame(), i.e., uses the un-evaluated arg names as entry names:
List <- function(...) {
names <- as.list(substitute(list(...)))[-1L]
setNames(list(...), names)
}
my.list <- List(model.product, model.i, model.add)
Then you can just access the names via:
names(my.list)

names(my.list) #..............
Oh wait, you didn't actually create names did you? There is actually no "memory" for the list function. It returns a list with the values of its arguments but not from whence they came, unless you add names to the pairlist given as the argument.

You won't be able to extract the information that way once you've created my.list.
The underlying way R works is that expressions are not evaluated until they're needed; using deparse(substitute()) will only work before the expression has been evaluated. So:
deparse(substitute(list(model.product, model.i, model.add)))
should work, while yours doesn't.
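A short sketch of the difference, using a throwaway helper arg_name and a dummy value for model.i (both invented for this illustration):

```r
# Returns the expression the caller typed for 'x', as a string
arg_name <- function(x) deparse(substitute(x))

model.i <- 1  # dummy stand-in for a fitted model

# Called with the variable itself, the original name is still visible
direct <- arg_name(model.i)

# Once the value sits in a list, only the indexing expression is visible
stored   <- list(model.i)
via_list <- arg_name(stored[[1]])

direct
via_list
```

Here direct is "model.i" and via_list is "stored[[1]]", which is the same effect as the "X[[1L]]" strings in the question: the list holds the values, not the names they came from.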

To save stuffing around, you could employ mget to collect your free-floating variables into a list with the names included:
one <- two <- three <- 1
result <- mget(c("one","two","three"))
result
#$one
#[1] 1
#
#$two
#[1] 1
#
#$three
#[1] 1
Then you can follow @DWin's suggestion:
names(result)
#[1] "one" "two" "three"

Access atomic vectors shown by Filter(is.atomic, eq) in R

Filter(is.atomic, something)
returns atomic vectors.
1. Weather example
> Filter(is.atomic, study)
$region
[1] "Hamburg" "Bremen"
2. mosaic-plot-as-tree-plot example
> Map(function(x) Filter(is.atomic, x), ls())
$g
[1] "g"
$lookup
[1] "lookup"
$req.data
[1] "req.data"
$tmp
[1] "tmp"
$tmp1
[1] "tmp1"
Their positions can be arbitrary, and I may not have the faintest clue about their data structure, so I cannot use var$some$...$vector. I feel the need for something like ?Position. Use your imagination; the examples are not exhaustive. How can I access their atomic vectors?

To flatten a list so you can access the atomic vectors, you can use the following function:
flatten.list <- function(x){
y <- list()
while(is.list(x)){
id <- sapply(x,is.atomic)
y <- c(y,x[id])
x <- unlist(x[!id],recursive=FALSE)
}
y
}
This function maintains the names of the elements. Usage, with the list x from Vincent's answer:
x <- list(
list(1:3, 4:6),
7:8,
list( list( list(9:11, 12:15), 16:20 ), 21:24 )
)
then:
> flatten.list(x)
[[1]]
[1] 7 8
[[2]]
[1] 1 2 3
[[3]]
[1] 4 5 6
[[4]]
[1] 21 22 23 24
...
To recursively do an action on all atomic elements in a list, use rapply() (which is what Vincent handcoded basically).
> rapply(x,sum)
[1] 6 15 15 30 54 90 90
> rapply(x,sum,how='list')
[[1]]
[[1]][[1]]
[1] 6
[[1]][[2]]
[1] 15
[[2]]
[1] 15
...
See also ?rapply
PS: Your code Map(function(x) Filter(is.atomic, x), ls()) doesn't make sense. ls() returns a character vector of object names, not the objects themselves, so every name is simply returned as an element of the list. This doesn't tell you anything at all.
Besides that, Filter() doesn't do what you believe it does. It only looks at the top level of the list: for the example list x from Vincent's answer, Filter() returns only the second element, because that is the only top-level element that is atomic. Filter(is.atomic, x) is equivalent to:
ind <- sapply(x, is.atomic)
x[ind]
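As a quick check of that equivalence on the example list (a sketch using only base R; top is a name made up here):

```r
x <- list(
  list(1:3, 4:6),
  7:8,
  list(list(list(9:11, 12:15), 16:20), 21:24)
)

# Filter() tests only the three top-level elements
top <- Filter(is.atomic, x)

# The manual version: logical index of the atomic top-level elements
ind <- sapply(x, is.atomic)

identical(top, x[ind])
length(top)
top[[1]]
```

Only 7:8 comes back, since the other two top-level elements are lists; the vectors nested inside them are never inspected.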

Your question is very unclear, to say the least: an example of the input data you have and the desired output would help...
Since you suggest that we "use our imagination", I assume that you have a hierarchical data structure, i.e., a list of lists of...of lists, whose depth is unknown. For instance,
x <- list(
list(1:3, 4:6),
7:8,
list( list( list(9:11, 12:15), 16:20 ), 21:24 )
)
The leaves are vectors, and you want to do "something" with those vectors.
For instance, you may want to concatenate them into a single vector: that is what the unlist function does.
unlist(x)
You could also want all the leaves, in a list, i.e., a list of vectors.
You can easily write a (recursive) function that explores the data structure and progressively builds that list, as follows.
leaves <- function(u) {
if( is.atomic(u) ) { return( list(u) ) }
result <- list()
for(e in u) {
result <- append( result, leaves(e) )
}
return(result)
}
leaves(x)
You could also want to apply a function to all the leaves, while preserving the structure of the data.
happly <- function(u, f, ...) {
if(is.atomic(u)) { return(f(u,...)) }
result <- lapply(u, function(v) NULL) # List of NULLs, with the same names
for(i in seq_along(u)) {
result[[i]] <- happly( u[[i]], f, ... )
}
return( result )
}
happly(x, range) # Apply the "range" function to all the leaves

Filter will return a list. The functions lapply and sapply are typically used to process the individual elements of a list object. If you instead want to access them by number using "[" or "[[", then you can determine the range of acceptable numbers with length(object). So object[[length(object)]] would get you the last element, while tail(object, 1) would give you that same element wrapped in a list of length one.
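A small sketch of positional access on a filtered list. The input list and the name object are invented for illustration.

```r
# A made-up list with two atomic elements and one nested list
object <- Filter(is.atomic, list(a = 1:3, b = list(4), c = c("x", "y")))

length(object)   # number of valid positions
names(object)

# "[[" extracts the element itself
last_elem <- object[[length(object)]]

# "[" and tail() return a sub-list, so the result is still a list
last_sub <- tail(object, 1)

last_elem
is.list(last_sub)
```

In this sketch last_elem is the character vector itself, whereas last_sub is a one-element list that still carries the name "c".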
