R split numeric vector at position - r

I am wondering about the simple task of splitting a vector into two at a certain index:
splitAt <- function(x, pos){
list(x[1:pos-1], x[pos:length(x)])
}
a <- c(1, 2, 2, 3)
> splitAt(a, 4)
[[1]]
[1] 1 2 2
[[2]]
[1] 3
My question: There must be some existing function for this, but I can't find it. Is split perhaps a possibility? My naive implementation also does not work if pos=0 or pos>length(a).

An improvement would be:
splitAt <- function(x, pos) unname(split(x, cumsum(seq_along(x) %in% pos)))
which can now take a vector of positions:
splitAt(a, c(2, 4))
# [[1]]
# [1] 1
#
# [[2]]
# [1] 2 2
#
# [[3]]
# [1] 3
And it does behave properly (subjective) if pos <= 1 or pos > length(x), in the sense that it returns the whole original vector in a single list item. If you'd like it to error out instead, use stopifnot at the top of the function.
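As a sketch of that last suggestion, the check could look like the following. The name splitAtStrict and the exact conditions (reject any pos that is <= 1 or > length(x)) are assumptions for illustration; adjust them to whatever you consider an invalid split.

```r
# Hypothetical strict variant; the name and the conditions are assumptions.
splitAtStrict <- function(x, pos) {
  # Fail early on positions that would not produce a real split.
  stopifnot(is.numeric(pos), all(pos > 1), all(pos <= length(x)))
  unname(split(x, cumsum(seq_along(x) %in% pos)))
}

a <- c(1, 2, 2, 3)
res <- splitAtStrict(a, c(2, 4))
str(res)
```

Calling splitAtStrict(a, 0) or splitAtStrict(a, 5) then stops with an error instead of silently returning the whole vector.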

I tried to use flodel's answer, but it was too slow in my case with a very large x (and the function has to be called repeatedly). So I created the following function that is much faster, but also very ugly and doesn't behave properly. In particular, it doesn't check anything and will return buggy results at least for pos >= length(x) or pos <= 0 (you can add those checks yourself if you're unsure about your inputs and not too concerned about speed), and perhaps some other cases as well, so be careful.
splitAt2 <- function(x, pos) {
out <- list()
pos2 <- c(1, pos, length(x)+1)
for (i in seq_along(pos2[-1])) {
out[[i]] <- x[pos2[i]:(pos2[i+1]-1)]
}
return(out)
}
However, splitAt2 runs about 20 times faster with an x of length 10^6:
library(microbenchmark)
W <- rnorm(1e6)
splits <- cumsum(rep(1e5, 9))
tm <- microbenchmark(
splitAt(W, splits),
splitAt2(W, splits),
times=10)
tm
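If the inputs cannot be trusted, the missing checks could be added roughly as follows. This is only a sketch: the name splitAt2Checked and the chosen conditions are assumptions, and the extra checks cost a little time on every call.

```r
# Hypothetical checked variant of the loop-based splitter.
splitAt2Checked <- function(x, pos) {
  # Guard the inputs that the unchecked loop mishandles.
  stopifnot(length(x) > 0,
            !is.unsorted(pos, strictly = TRUE),
            all(pos > 1), all(pos <= length(x)))
  bounds <- c(1, pos, length(x) + 1)
  out <- vector("list", length(bounds) - 1)  # preallocate the result
  for (i in seq_along(out)) {
    out[[i]] <- x[bounds[i]:(bounds[i + 1] - 1)]
  }
  out
}

res <- splitAt2Checked(c(1, 2, 2, 3), c(2, 4))
str(res)
```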

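Another alternative that might be faster and/or more readable/elegant than flodel's solution (pos must be sorted in increasing order):
splitAt <- function(x, pos) {
# group by position, not by value, so findInterval is applied to seq_along(x)
unname(split(x, findInterval(seq_along(x), pos)))
}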
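Since the split is meant to be by position, it may help to see how findInterval groups positions as opposed to values. A small sketch using the a from the question (the variable names by_value and by_position are made up for illustration):

```r
a <- c(1, 2, 2, 3)
pos <- c(2, 4)

# Grouping computed from the values of a: only right when a happens to be 1:n.
by_value <- findInterval(a, pos)
# Grouping computed from the positions 1..n (pos assumed sorted increasingly).
by_position <- findInterval(seq_along(a), pos)

res <- unname(split(a, by_position))
str(res)
```

Here by_value puts the last three elements into one group, while by_position separates the fourth element as intended.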

Related

How to concisely deal with subsets when their lengths become zero?

To exclude elements from a vector x,
x <- c(1, 4, 3, 2)
we can subtract a vector of positions:
excl <- c(2, 3)
x[-excl]
# [1] 1 2
This also works dynamically,
(excl <- which(x[-which.max(x)] > quantile(x, .25)))
# [1] 2 3
x[-excl]
# [1] 1 2
until excl is of length zero:
excl.nolength <- which(x[-which.max(x)] > quantile(x, .95))
length(excl.nolength)
# [1] 0
x[-excl.nolength]
# integer(0)
I could kind of reformulate that, but I have many objects to which excl is applied, say:
letters[1:4][-excl.nolength]
# character(0)
I know I could use setdiff, but that's rather long and hard to read:
x[setdiff(seq(x), excl.nolength)]
# [1] 1 4 3 2
letters[1:4][setdiff(seq(letters[1:4]), excl.nolength)]
# [1] "a" "b" "c" "d"
Now, I could exploit the fact that nothing is excluded if the element number is greater than the number of elements:
length(x)
# [1] 4
x[-5]
# [1] 1 4 3 2
To generalize that I should probably use .Machine$integer.max:
tmp <- which(x[-which.max(x)] > quantile(x, .95))
excl <- if (!length(tmp) == 0) tmp else .Machine$integer.max
x[-excl]
# [1] 1 4 3 2
Wrapped into a function,
e <- function(x) if (!length(x) == 0) x else .Machine$integer.max
that's quite handy and clear:
x[-e(excl)]
# [1] 1 2
x[-e(excl.nolength)]
# [1] 1 4 3 2
letters[1:4][-e(excl.nolength)]
# [1] "a" "b" "c" "d"
But it seems a little fishy to me...
Is there a better equally concise way to deal with a subset of length zero in base R?
Edit
excl comes out as the dynamic result of an earlier function call (as shown with which above) and might or might not be of length zero. If length(excl) == 0, nothing should be excluded. Ideally, subsequent lines of code such as x[-excl] should not have to change at all, or as little as possible.
You can overwrite [ with your own function.
"[" <- function(x,y) {if(length(y)==0) x else .Primitive("[")(x,y)}
x <- c(1, 4, 3, 2)
excl <- c(2, 3)
x[-excl]
#[1] 1 2
excl <- integer()
x[-excl]
#[1] 1 4 3 2
rm("[") #Go back to normal mode
I would argue this is somewhat opinion based.
For example, I find:
x <- x[-if(length(excl <- which(x[-which.max(x)] > quantile(x, .95))) == 0) .Machine$integer.max else excl]
very unreadable, but some people like one-liners. Reading package code, you'll often find this split up instead, as in one of the many suggestions you gave:
excl <- which(x[-which.max(x)] > quantile(x, .95))
if(length(excl) != 0)
x <- x[-excl]
Alternatively, you could avoid which and simply use the logical vector for subsetting, which most would likely consider cleaner:
x <- x[!x[-which.max(x)] > quantile(x, .95)]
This would avoid the zero-length index problem, at the cost of some loss of efficiency.
As a side note, the example used above and in the question seems somewhat off. First, which.max only returns the first index that equals the max value; in addition, the indices are offset for every value removed. More likely the intended example would be
x <- x[!(x > quantile(x, .95))[-which(x == max(x))]]
How about this?
a <- letters[1:3]
excl1 <- c(1,3)
excl2 <- c()
a[!(seq_along(a) %in% excl1)]
a[!(seq_along(a) %in% excl2)]
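To keep call sites short, the logical-mask idea can be wrapped in a small helper. This is a sketch only; drop_at is a made-up name, not a base R function.

```r
# Hypothetical helper: drop the elements of x at the positions in excl.
drop_at <- function(x, excl) {
  # A logical mask keeps everything when excl is empty (integer(0) or NULL).
  x[!(seq_along(x) %in% excl)]
}

x <- c(1, 4, 3, 2)
drop_at(x, c(2, 3))
drop_at(x, integer(0))
drop_at(letters[1:4], NULL)
```

Unlike x[-excl], the mask never turns an empty exclusion set into an empty result, so no sentinel such as .Machine$integer.max is needed.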

Why can't I assign to multiple variables using mapply/assign? [duplicate]

I want to assign multiple variables in a single line in R. Is it possible to do something like this?
values # initialize some vector of values
(a, b) = values[c(2,4)] # assign a and b to values at 2 and 4 indices of 'values'
Typically I want to assign about 5-6 variables in a single line, instead of having multiple lines. Is there an alternative?
I put together an R package zeallot to tackle this very problem. zeallot includes an operator (%<-%) for unpacking, multiple, and destructuring assignment. The LHS of the assignment expression is built using calls to c(). The RHS of the assignment expression may be any expression which returns or is a vector, list, nested list, data frame, character string, date object, or custom objects (assuming there is a destructure implementation).
Here is the initial question reworked using zeallot (latest version, 0.0.5).
library(zeallot)
values <- c(1, 2, 3, 4) # initialize a vector of values
c(a, b) %<-% values[c(2, 4)] # assign `a` and `b`
a
#[1] 2
b
#[1] 4
For more examples and information one can check out the package vignette.
There is a great answer on the Struggling Through Problems Blog
This is taken from there, with very minor modifications.
USING THE FOLLOWING THREE FUNCTIONS
(Plus one for allowing for lists of different sizes)
# Generic form
'%=%' = function(l, r, ...) UseMethod('%=%')
# Binary Operator
'%=%.lbunch' = function(l, r, ...) {
Envir = as.environment(-1)
if (length(r) > length(l))
warning("RHS has more args than LHS. Only first", length(l), "used.")
if (length(l) > length(r)) {
warning("LHS has more args than RHS. RHS will be repeated.")
r <- extendToMatch(r, l)
}
for (II in 1:length(l)) {
do.call('<-', list(l[[II]], r[[II]]), envir=Envir)
}
}
# Used if LHS is larger than RHS
extendToMatch <- function(source, destin) {
s <- length(source)
d <- length(destin)
# Assume that destin is a length when it is a single number and source is not
if(d==1 && s>1 && !is.null(as.numeric(destin)))
d <- destin
dif <- d - s
if (dif > 0) {
source <- rep(source, ceiling(d/s))[1:d]
}
return (source)
}
# Grouping the left hand side
g = function(...) {
List = as.list(substitute(list(...)))[-1L]
class(List) = 'lbunch'
return(List)
}
Then to execute:
Group the left hand side using the new function g()
The right hand side should be a vector or a list
Use the newly-created binary operator %=%
# Example Call; Note the use of g() AND `%=%`
# Right-hand side can be a list or vector
g(a, b, c) %=% list("hello", 123, list("apples, oranges"))
g(d, e, f) %=% 101:103
# Results:
> a
[1] "hello"
> b
[1] 123
> c
[[1]]
[1] "apples, oranges"
> d
[1] 101
> e
[1] 102
> f
[1] 103
Example using lists of different sizes:
Longer Left Hand Side
g(x, y, z) %=% list("first", "second")
# Warning message:
# In `%=%.lbunch`(g(x, y, z), list("first", "second")) :
# LHS has more args than RHS. RHS will be repeated.
> x
[1] "first"
> y
[1] "second"
> z
[1] "first"
Longer Right Hand Side
g(j, k) %=% list("first", "second", "third")
# Warning message:
# In `%=%.lbunch`(g(j, k), list("first", "second", "third")) :
# RHS has more args than LHS. Only first2used.
> j
[1] "first"
> k
[1] "second"
Consider using functionality included in base R.
For instance, create a 1 row dataframe (say V) and initialize your variables in it. Now you can assign to multiple variables at once V[,c("a", "b")] <- values[c(2, 4)], call each one by name (V$a), or use many of them at the same time (values[c(5, 6)] <- V[,c("a", "b")]).
If you get lazy and don't want to go around calling variables from the dataframe, you could attach(V) (though I personally don't ever do it).
# Initialize values
values <- 1:100
# V for variables
V <- data.frame(a=NA, b=NA, c=NA, d=NA, e=NA)
# Assign elements from a vector
V[, c("a", "b", "e")] = values[c(2,4, 8)]
# Also other class
V[, "d"] <- "R"
# Use your variables
V$a
V$b
V$c # OOps, NA
V$d
V$e
Here is my idea. The syntax is probably quite simple:
`%tin%` <- function(x, y) {
mapply(assign, as.character(substitute(x)[-1]), y,
MoreArgs = list(envir = parent.frame()))
invisible()
}
c(a, b) %tin% c(1, 2)
which gives:
> a
Error: object 'a' not found
> b
Error: object 'b' not found
> c(a, b) %tin% c(1, 2)
> a
[1] 1
> b
[1] 2
This is not well tested, though.
A potentially dangerous (in as much as using assign is risky) option would be to Vectorize assign:
assignVec <- Vectorize("assign",c("x","value"))
#.GlobalEnv is probably not what one wants in general; see below.
assignVec(c('a','b'),c(0,4),envir = .GlobalEnv)
a b
0 4
> b
[1] 4
> a
[1] 0
Or I suppose you could vectorize it yourself manually with your own function using mapply that maybe uses a sensible default for the envir argument. For instance, Vectorize will return a function with the same environment properties of assign, which in this case is namespace:base, or you could just set envir = parent.env(environment(assignVec)).
As others explained, there doesn't seem to be anything built in. ...but you could design a vassign function as follows:
vassign <- function(..., values, envir=parent.frame()) {
vars <- as.character(substitute(...()))
values <- rep(values, length.out=length(vars))
for(i in seq_along(vars)) {
assign(vars[[i]], values[[i]], envir)
}
}
# Then test it
vals <- 11:14
vassign(aa,bb,cc,dd, values=vals)
cc # 13
One thing to consider though is how to handle the cases where you e.g. specify 3 variables and 5 values or the other way around. Here I simply repeat (or truncate) the values to be of the same length as the variables. Maybe a warning would be prudent. But it allows the following:
vassign(aa,bb,cc,dd, values=0)
cc # 0
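The warning mentioned above could be added along these lines. This is a sketch under assumptions: vassign_warn is a made-up name, and it treats a single value as intentional recycling, so it only warns for other length mismatches.

```r
# Hypothetical variant of vassign that warns when lengths do not match.
vassign_warn <- function(..., values, envir = parent.frame()) {
  # Capture the unevaluated variable names passed through `...`.
  vars <- vapply(as.list(substitute(list(...)))[-1L], deparse, character(1))
  if (length(values) != length(vars) && length(values) != 1L) {
    warning("number of values differs from number of variables; ",
            "values will be recycled or truncated")
  }
  values <- rep(values, length.out = length(vars))
  for (i in seq_along(vars)) {
    assign(vars[[i]], values[[i]], envir = envir)
  }
  invisible(NULL)
}

vassign_warn(aa, bb, values = c(11, 12))
aa
bb
```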
list2env(setNames(as.list(rep(2,5)), letters[1:5]), .GlobalEnv)
Served my purpose, i.e., assigning five 2s into first five letters.
Had a similar problem recently and here was my try using purrr::walk2
purrr::walk2(letters,1:26,assign,envir =parent.frame())
https://stat.ethz.ch/R-manual/R-devel/library/base/html/list2env.html:
list2env(
list(
a=1,
b=2:4,
c=rpois(10,10),
d=gl(3,4,LETTERS[9:11])
),
envir=.GlobalEnv
)
If your only requirement is to have a single line of code, then how about:
> a<-values[2]; b<-values[4]
I'm afraid the elegant solution you are looking for (like c(a, b) = c(2, 4)) unfortunately does not exist. But don't give up, I'm not sure! The nearest solution I can think of is this one:
attach(data.frame(a = 2, b = 4))
or if you are bothered by warnings, switch them off:
attach(data.frame(a = 2, b = 4), warn.conflicts = FALSE)
But I suppose you're not satisfied with this solution, I wouldn't be either...
R> values = c(1,2,3,4)
R> a <- values[2]; b <- values[3]; c <- values[4]
R> a
[1] 2
R> b
[1] 3
R> c
[1] 4
Another version with recursion:
let <- function(..., env = parent.frame()) {
f <- function(x, ..., i = 1) {
if(is.null(substitute(...))){
if(length(x) == 1)
x <- rep(x, i - 1);
stopifnot(length(x) == i - 1)
return(x);
}
val <- f(..., i = i + 1);
assign(deparse(substitute(x)), val[[i]], env = env);
return(val)
}
f(...)
}
example:
> let(a, b, 4:10)
[1] 4 5 6 7 8 9 10
> a
[1] 4
> b
[1] 5
> let(c, d, e, f, c(4, 3, 2, 1))
[1] 4 3 2 1
> c
[1] 4
> f
[1] 1
My version:
let <- function(x, value) {
mapply(
assign,
as.character(substitute(x)[-1]),
value,
MoreArgs = list(envir = parent.frame()))
invisible()
}
example:
> let(c(x, y), 1:2 + 3)
> x
[1] 4
> y
[1] 5
Combining some of the answers given here + a little bit of salt, how about this solution:
assignVec <- Vectorize("assign", c("x", "value"))
`%<<-%` <- function(x, value) invisible(assignVec(x, value, envir = .GlobalEnv))
c("a", "b") %<<-% c(2, 4)
a
## [1] 2
b
## [1] 4
I used this to add the R section here: http://rosettacode.org/wiki/Sort_three_variables#R
Caveat: It only works for assigning global variables (like <<-). If there is a better, more general solution, please tell me in the comments.
For a named list, use
list2env(mylist, environment())
For instance:
mylist <- list(foo = 1, bar = 2)
list2env(mylist, environment())
will add foo = 1, bar = 2 to the current environment, and override any object with those names. This is equivalent to
mylist <- list(foo = 1, bar = 2)
foo <- mylist$foo
bar <- mylist$bar
This works in a function, too:
f <- function(mylist) {
list2env(mylist, environment())
foo * bar
}
mylist <- list(foo = 1, bar = 2)
f(mylist)
However, it is good practice to name the elements you want to include in the current environment, lest you override another object, and so preferably write
list2env(mylist[c("foo", "bar")], environment())
Finally, if you want different names for the new imported objects, write:
list2env(`names<-`(mylist[c("foo", "bar")], c("foo2", "bar2")), environment())
which is equivalent to
foo2 <- mylist$foo
bar2 <- mylist$bar
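The renaming step can also be written with setNames, which some find easier to read than calling `names<-` directly. A minimal sketch; the scratch environment target and the extra element baz are assumptions added so the example does not touch your workspace:

```r
mylist <- list(foo = 1, bar = 2, baz = 3)

# A scratch environment, used here instead of environment() so that
# nothing in the calling environment is overwritten.
target <- new.env()

# Select only the wanted elements, rename them, then import them.
list2env(setNames(mylist[c("foo", "bar")], c("foo2", "bar2")), envir = target)

ls(target)
target$foo2
target$bar2
```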

Check that a vector is contained in a matrix in R

I can't believe this is taking me this long to figure out, and I still can't figure it out.
I need to keep a collection of vectors, and later check that a certain vector is in that collection. I tried lists combined with %in% but that doesn't appear to work properly.
My next idea was to create a matrix and rbind vectors to it, but now I don't know how to check if a vector is contained in a matrix. %in% appears to compare sets and not exact rows. The same appears to apply to intersect.
Help much appreciated!
Do you mean like this:
wantVec <- c(3,1,2)
myList <- list(A = c(1:3), B = c(3,1,2), C = c(2,3,1))
sapply(myList, function(x, want) isTRUE(all.equal(x, want)), wantVec)
## or, is the vector in the set?
any(sapply(myList, function(x, want) isTRUE(all.equal(x, want)), wantVec))
We can do a similar thing with a matrix:
myMat <- matrix(unlist(myList), ncol = 3, byrow = TRUE)
## As the vectors are now in the rows, we use apply over the rows
apply(myMat, 1, function(x, want) isTRUE(all.equal(x, want)), wantVec)
## or
any(apply(myMat, 1, function(x, want) isTRUE(all.equal(x, want)), wantVec))
Or by columns:
myMat2 <- matrix(unlist(myList), ncol = 3)
## As the vectors are now in the cols, we use apply over the cols
apply(myMat2, 2, function(x, want) isTRUE(all.equal(x, want)), wantVec)
## or
any(apply(myMat2, 2, function(x, want) isTRUE(all.equal(x, want)), wantVec))
If you need to do this a lot, write your own function
vecMatch <- function(x, want) {
isTRUE(all.equal(x, want))
}
And then use it, e.g. on the list myList:
> sapply(myList, vecMatch, wantVec)
A B C
FALSE TRUE FALSE
> any(sapply(myList, vecMatch, wantVec))
[1] TRUE
Or even wrap the whole thing:
vecMatch <- function(x, want) {
out <- sapply(x, function(x, want) isTRUE(all.equal(x, want)), want)
any(out)
}
> vecMatch(myList, wantVec)
[1] TRUE
> vecMatch(myList, 5:3)
[1] FALSE
EDIT: Quick comment on why I used isTRUE() wrapped around the all.equal() calls. This is due to the fact that where the two arguments are not equal, all.equal() doesn't return a logical value (FALSE):
> all.equal(1:3, c(3,2,1))
[1] "Mean relative difference: 1"
isTRUE() is useful here because it returns TRUE iff its argument is TRUE, whilst it returns FALSE if it is anything else.
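A short illustration of that behaviour, including the tolerance that all.equal() applies to numeric values (the variable near is made up for the example):

```r
res <- all.equal(1:3, c(3, 2, 1))
is.character(res)            # all.equal() reports the difference as text
isTRUE(res)                  # isTRUE() turns that into a plain FALSE

near <- 1 + 1e-10
isTRUE(all.equal(1, near))   # TRUE: within the default numeric tolerance
identical(1, near)           # FALSE: exact comparison
```

If exact equality is what you need, identical() may be the better test; all.equal() is the safer choice for floating-point results.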
> M
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
v <- c(2, 5, 8)
Check each column:
c1 <- which(M[, 1] == v[1])
c2 <- which(M[, 2] == v[2])
c3 <- which(M[, 3] == v[3])
Here is a way to still use intersect() on more than 2 elements
> intersect(intersect(c1, c2), c3)
[1] 2
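The column-by-column check can be generalized to any number of columns. The following is a sketch, not part of the answer above: rows_matching is a made-up name, and it uses exact == comparison, so it is not suitable for floating-point values that differ by rounding.

```r
# Hypothetical helper: indices of the rows of M that equal v exactly.
rows_matching <- function(M, v) {
  stopifnot(ncol(M) == length(v))
  # Repeat v down the rows so each row of M is compared element-wise with v.
  hits <- M == matrix(v, nrow = nrow(M), ncol = ncol(M), byrow = TRUE)
  # A row matches when every one of its columns agrees.
  which(rowSums(hits) == ncol(M))
}

M <- matrix(1:9, nrow = 3)
rows_matching(M, c(2, 5, 8))
rows_matching(M, c(2, 5, 9))
```

Wrapping the call in length(...) > 0 gives a plain TRUE/FALSE answer to "is this vector a row of the matrix?".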

How to assign from a function which returns more than one value?

Still trying to get into the R logic... what is the "best" way to unpack (on LHS) the results from a function returning multiple values?
I can't do this apparently:
R> functionReturningTwoValues <- function() { return(c(1, 2)) }
R> functionReturningTwoValues()
[1] 1 2
R> a, b <- functionReturningTwoValues()
Error: unexpected ',' in "a,"
R> c(a, b) <- functionReturningTwoValues()
Error in c(a, b) <- functionReturningTwoValues() : object 'a' not found
must I really do the following?
R> r <- functionReturningTwoValues()
R> a <- r[1]; b <- r[2]
or would the R programmer write something more like this:
R> functionReturningTwoValues <- function() {return(list(first=1, second=2))}
R> r <- functionReturningTwoValues()
R> r$first
[1] 1
R> r$second
[1] 2
--- edited to answer Shane's questions ---
I don't really need to give names to the parts of the result. I am applying one aggregate function to the first component and another to the second component (min and max; if it were the same function for both components, I would not need to split them).
(1) list[...]<- I had posted this over a decade ago on r-help. Since then it has been added to the gsubfn package. It does not require a special operator but does require that the left hand side be written using list[...] like this:
library(gsubfn) # need 0.7-0 or later
list[a, b] <- functionReturningTwoValues()
If you only need the first or second component these all work too:
list[a] <- functionReturningTwoValues()
list[a, ] <- functionReturningTwoValues()
list[, b] <- functionReturningTwoValues()
(Of course, if you only needed one value then functionReturningTwoValues()[[1]] or functionReturningTwoValues()[[2]] would be sufficient.)
See the cited r-help thread for more examples.
(2) with If the intent is merely to combine the multiple values subsequently and the return values are named then a simple alternative is to use with :
myfun <- function() list(a = 1, b = 2)
list[a, b] <- myfun()
a + b
# same
with(myfun(), a + b)
(3) attach Another alternative is attach:
attach(myfun())
a + b
ADDED: with and attach
I somehow stumbled on this clever hack on the internet ... I'm not sure if it's nasty or beautiful, but it lets you create a "magical" operator that allows you to unpack multiple return values into their own variable. The := function is defined here, and included below for posterity:
':=' <- function(lhs, rhs) {
frame <- parent.frame()
lhs <- as.list(substitute(lhs))
if (length(lhs) > 1)
lhs <- lhs[-1]
if (length(lhs) == 1) {
do.call(`=`, list(lhs[[1]], rhs), envir=frame)
return(invisible(NULL))
}
if (is.function(rhs) || is(rhs, 'formula'))
rhs <- list(rhs)
if (length(lhs) > length(rhs))
rhs <- c(rhs, rep(list(NULL), length(lhs) - length(rhs)))
for (i in 1:length(lhs))
do.call(`=`, list(lhs[[i]], rhs[[i]]), envir=frame)
return(invisible(NULL))
}
With that in hand, you can do what you're after:
functionReturningTwoValues <- function() {
return(list(1, matrix(0, 2, 2)))
}
c(a, b) := functionReturningTwoValues()
a
#[1] 1
b
# [,1] [,2]
# [1,] 0 0
# [2,] 0 0
I don't know how I feel about that. Perhaps you might find it helpful in your interactive workspace. Using it to build (re-)usable libraries (for mass consumption) might not be the best idea, but I guess that's up to you.
... you know what they say about responsibility and power ...
Usually I wrap the output into a list, which is very flexible (you can have any combination of numbers, strings, vectors, matrices, arrays, lists, objects in the output), like so:
func2<-function(input) {
a<-input+1
b<-input+2
output<-list(a,b)
return(output)
}
output<-func2(5)
for (i in output) {
print(i)
}
[1] 6
[1] 7
I put together an R package zeallot to tackle this problem. zeallot includes a multiple assignment or unpacking assignment operator, %<-%. The LHS of the operator is any number of variables to assign, built using calls to c(). The RHS of the operator is a vector, list, data frame, date object, or any custom object with an implemented destructure method (see ?zeallot::destructure).
Here are a handful of examples based on the original post,
library(zeallot)
functionReturningTwoValues <- function() {
return(c(1, 2))
}
c(a, b) %<-% functionReturningTwoValues()
a # 1
b # 2
functionReturningListOfValues <- function() {
return(list(1, 2, 3))
}
c(d, e, f) %<-% functionReturningListOfValues()
d # 1
e # 2
f # 3
functionReturningNestedList <- function() {
return(list(1, list(2, 3)))
}
c(f, c(g, h)) %<-% functionReturningNestedList()
f # 1
g # 2
h # 3
functionReturningTooManyValues <- function() {
return(as.list(1:20))
}
c(i, j, ...rest) %<-% functionReturningTooManyValues()
i # 1
j # 2
rest # list(3, 4, 5, ..)
Check out the package vignette for more information and examples.
functionReturningTwoValues <- function() {
results <- list()
results$first <- 1
results$second <-2
return(results)
}
a <- functionReturningTwoValues()
I think this works.
There's no right answer to this question. It really depends on what you're doing with the data. In the simple example above, I would strongly suggest:
Keep things as simple as possible.
Wherever possible, it's a best practice to keep your functions vectorized. That provides the greatest amount of flexibility and speed in the long run.
Is it important that the values 1 and 2 above have names? In other words, why is it important in this example that 1 and 2 be named a and b, rather than just r[1] and r[2]? One important thing to understand in this context is that a and b are also both vectors of length 1. So you're not really changing anything in the process of making that assignment, other than having 2 new vectors that don't need subscripts to be referenced:
> r <- c(1,2)
> a <- r[1]
> b <- r[2]
> class(r)
[1] "numeric"
> class(a)
[1] "numeric"
> a
[1] 1
> a[1]
[1] 1
You can also assign the names to the original vector if you would rather reference the letter than the index:
> names(r) <- c("a","b")
> names(r)
[1] "a" "b"
> r["a"]
a
1
[Edit] Given that you will be applying min and max to each vector separately, I would suggest either using a matrix (if a and b will be the same length and the same data type) or data frame (if a and b will be the same length but can be different data types) or else use a list like in your last example (if they can be of differing lengths and data types).
> r <- data.frame(a=1:4, b=5:8)
> r
a b
1 1 5
2 2 6
3 3 7
4 4 8
> min(r$a)
[1] 1
> max(r$b)
[1] 8
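If the two components are kept in a list, a different aggregate can be applied to each without unpacking them into separate variables. A minimal sketch; the data inside functionReturningTwoValues is made up, and pairing functions with components via mapply is just one possible approach:

```r
functionReturningTwoValues <- function() {
  list(first = c(3, 1, 2), second = c(10, 30, 20))  # made-up data
}

r <- functionReturningTwoValues()

# Pair each component with the aggregate to apply to it:
# min for the first component, max for the second.
agg <- mapply(function(f, v) f(v), list(min, max), r)
agg
```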
If you want to return the output of your function to the Global Environment, you can use list2env, like in this example:
myfun <- function(x) {
  a <- 1:x
  b <- 5:x
  df <- data.frame(a = a, b = b)
  newList <- list("my_obj1" = a, "my_obj2" = b, "myDF" = df)
  list2env(newList, .GlobalEnv)
}
myfun(3)
This function will create three objects in your Global Environment:
> my_obj1
[1] 1 2 3
> my_obj2
[1] 5 4 3
> myDF
a b
1 1 5
2 2 4
3 3 3
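If writing into the Global Environment feels too invasive, one option is to let list2env create a fresh environment instead. This is only a sketch: myfun_env and its element names are hypothetical, and it relies on list2env creating a new environment when no envir argument is supplied.

```r
# Hypothetical variant of myfun that does not touch the Global Environment
myfun_env <- function(x) {
  a <- 1:x
  b <- 5:x
  # With envir omitted, list2env() creates and returns a new environment
  list2env(list(my_obj1 = a, my_obj2 = b))
}

res <- myfun_env(3)
res$my_obj1        # objects are reached through the returned environment
res$my_obj2
sort(ls(res))      # names of the objects stored in it
```

The caller then decides where the objects live, which tends to be easier to reason about than objects appearing in the workspace as a side effect.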
Lists seem perfect for this purpose. For example, within the function you would have
x = desired_return_value_1 # (vector, matrix, etc)
y = desired_return_value_2 # (vector, matrix, etc)
returnlist = list(x, y) # add any further return values here
return(returnlist)
} # end of function
and in the main program, once the function's result has been stored in returnlist:
x = returnlist[[1]]
y = returnlist[[2]]
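A runnable version of that outline could look like this. The function name summarise_vec and the particular return values are assumptions chosen only to have something concrete to return; naming the list elements is optional but makes access by $ possible.

```r
# Hypothetical function returning two values of different kinds in one list
summarise_vec <- function(v) {
  x <- range(v)              # a vector of length 2
  y <- matrix(v, nrow = 2)   # a matrix
  list(x = x, y = y)
}

# main program
returnlist <- summarise_vec(1:6)
x <- returnlist[[1]]   # access by position
y <- returnlist$y      # or by name
```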
Yes to your second and third questions -- that's what you need to do as you cannot have multiple 'lvalues' on the left of an assignment.
How about using assign?
functionReturningTwoValues <- function(a, b) {
assign(a, 1, pos=1)
assign(b, 2, pos=1)
}
You can pass the names of the variables you want to be passed by reference.
> functionReturningTwoValues('a', 'b')
> a
[1] 1
> b
[1] 2
If you need to access the existing values, the converse of assign is get.
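As a sketch of how assign and get pair up, the variant below writes into an explicitly supplied environment rather than position 1 of the search path. The function set_two and the env argument are hypothetical; mget is used to read several variables back at once.

```r
# Hypothetical helper: create two variables, by name, in a given environment
set_two <- function(name_a, name_b, env) {
  assign(name_a, 1, envir = env)
  assign(name_b, 2, envir = env)
  invisible(NULL)
}

env <- new.env()
set_two("a", "b", env)

get("a", envir = env)                    # read one variable by name
vals <- mget(c("a", "b"), envir = env)   # read several as a named list
```

Passing the environment explicitly keeps the side effect visible at the call site, at the cost of one extra argument.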
[A]
If each of foo and bar is a single number, then there's nothing wrong with c(foo,bar); and you can also name the components: c(Foo=foo,Bar=bar). So you could access the components of the result 'res' as res[1], res[2]; or, in the named case, as res["Foo"], res["Bar"].
[B]
If foo and bar are vectors of the same type and length, then again there's nothing wrong with returning cbind(foo,bar) or rbind(foo,bar); likewise nameable. In the 'cbind' case, you would access foo and bar as res[,1], res[,2] or as res[,"Foo"], res[,"Bar"]. You might also prefer to return a dataframe rather than a matrix:
data.frame(Foo=foo,Bar=bar)
and access them as res$Foo, res$Bar. This would also work well if foo and bar were of the same length but not of the same type (e.g. foo is a vector of numbers, bar a vector of character strings).
[C]
If foo and bar are sufficiently different not to combine conveniently as above, then you should definitely return a list.
For example, your function might fit a linear model and also calculate predicted values, so you could have
LM<-lm(....) ; foo<-summary(LM); bar<-LM$fit
and then you would return list(Foo=foo,Bar=bar) and then access the summary as res$Foo, the predicted values as res$Bar
source: http://r.789695.n4.nabble.com/How-to-return-multiple-values-in-a-function-td858528.html
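The three cases [A], [B] and [C] can be tried out with a short sketch like the one below. The values, the function name fit_cars and the choice of the built-in cars data set are assumptions made purely for illustration.

```r
# [A] two single numbers: a named vector is enough
resA <- c(Foo = 1, Bar = 2)
resA["Bar"]

# [B] two vectors of the same type and length: a matrix with named columns
resB <- cbind(Foo = 1:3, Bar = 4:6)
resB[, "Bar"]

# [C] unlike objects: a named list (hypothetical function on the cars data)
fit_cars <- function() {
  LM <- lm(dist ~ speed, data = cars)
  list(Foo = summary(LM), Bar = fitted(LM))
}
resC <- fit_cars()
class(resC$Foo)    # the model summary
length(resC$Bar)   # one fitted value per row of cars
```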
As of 2021, this is something I use frequently.
The tidyverse provides a function called lst (from the tibble package) that automatically names the list elements after the variables used to create the list.
After that I either use list2env() to assign the elements to variables or use the list directly.
library(tidyverse)
fun <- function(){
a<-1
b<-2
lst(a,b)
}
list2env(fun(), envir=.GlobalEnv) # unpacks the list's name-value pairs into variables in the global environment
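If loading the tidyverse is not desirable, a similar self-naming list can be sketched in base R with mget. This is a swapped-in alternative, not what lst does internally, and fun_base is a hypothetical name.

```r
fun_base <- function() {
  a <- 1
  b <- 2
  # Collect the local variables by name into a named list
  mget(c("a", "b"), envir = environment())
}

res <- fun_base()
names(res)          # element names match the variable names
with(res, a + b)    # use the list directly, without unpacking it
```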
This is only for the sake of completeness and not because I personally prefer it. You can pipe %>% the result, evaluate it with curly braces {} and write variables to the parent environment using double-arrow <<-.
library(tidyverse)
functionReturningTwoValues() %>% {a <<- .[1]; b <<- .[2]}
UPDATE:
You can also use the multiple assignment operator %<-% from the zeallot package:
library(zeallot)
c(a, b) %<-% list(0, 1)
I will post a function that returns multiple objects by way of vectors:
Median <- function(X){
X_Sort <- sort(X)
if (length(X)%%2==0){
Median <- (X_Sort[(length(X)/2)]+X_Sort[(length(X)/2)+1])/2
} else{
Median <- X_Sort[(length(X)+1)/2]
}
return(Median)
}
That was a function I created to calculate the median. I know that R has a built-in function called median(), but I wrote my own anyway so that I could build another function, which calculates the quartiles of a numeric data set, on top of the Median() function I had just programmed. The Median() function works like this:
If a numeric vector X has an even number of elements (i.e., length(X)%%2==0), the median is calculated by averaging the elements sort(X)[length(X)/2] and sort(X)[(length(X)/2+1)].
If X doesn't have an even number of elements, the median is sort(X)[(length(X)+1)/2].
On to the QuartilesFunction():
QuartilesFunction <- function(X){
X_Sort <- sort(X) # Data is sorted in ascending order
if (length(X)%%2==0){
# Data number is even
HalfDN <- X_Sort[1:(length(X)/2)]
HalfUP <- X_Sort[((length(X)/2)+1):length(X)]
QL <- Median(HalfDN)
QU <- Median(HalfUP)
QL1 <- QL
QL2 <- QL
QU1 <- QU
QU2 <- QU
QL3 <- QL
QU3 <- QU
Quartiles <- c(QL1,QU1,QL2,QU2,QL3,QU3)
names(Quartiles) = c("QL (1)", "QU (1)", "QL (2)", "QU (2)","QL (3)", "QU (3)")
} else{ # Data number is odd
# Including the median
Half1DN <- X_Sort[1:((length(X)+1)/2)]
Half1UP <- X_Sort[(((length(X)+1)/2)):length(X)]
QL1 <- Median(Half1DN)
QU1 <- Median(Half1UP)
# Not including the median
Half2DN <- X_Sort[1:(((length(X)+1)/2)-1)]
Half2UP <- X_Sort[(((length(X)+1)/2)+1):length(X)]
QL2 <- Median(Half2DN)
QU2 <- Median(Half2UP)
# Methods (1) and (2) averaged
QL3 <- (QL1+QL2)/2
QU3 <- (QU1+QU2)/2
Quartiles <- c(QL1,QU1,QL2,QU2,QL3,QU3)
names(Quartiles) = c("QL (1)", "QU (1)", "QL (2)", "QU (2)","QL (3)", "QU (3)")
}
return(Quartiles)
}
This function returns the quartiles of a numeric vector by using three methods:
Keeping the median for the calculation of the quartiles when the number of elements of the numeric vector X is odd (method 1 in the code above).
Discarding the median for the calculation of the quartiles when the number of elements of the numeric vector X is odd (method 2 in the code above).
Averaging the results obtained by using methods 1 and 2 (method 3).
When the number of elements in the numeric vector X is even, the three methods coincide.
The result of the QuartilesFunction() is a named vector containing the first and third quartiles calculated by each of the three methods outlined.
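To make the difference between the methods concrete, here is a small self-contained sketch for a vector with an odd number of elements. It uses the built-in median() rather than the Median() function above, and the example data are made up; the comparison with fivenum() is only claimed for this particular example.

```r
x <- sort(c(7, 1, 5, 3, 2, 6, 4))   # hypothetical data, 7 elements
n <- length(x)
mid <- (n + 1) / 2                  # position of the median

# Halves that include the median
incl <- c(QL = median(x[1:mid]), QU = median(x[mid:n]))

# Halves that exclude the median
excl <- c(QL = median(x[1:(mid - 1)]), QU = median(x[(mid + 1):n]))

# Average of the two approaches
avg <- (incl + excl) / 2

incl   # QL = 2.5,  QU = 5.5
excl   # QL = 2,    QU = 6
avg    # QL = 2.25, QU = 5.75

# For this example, the hinges from fivenum() agree with the "including" variant
fivenum(x)[c(2, 4)]
```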
With R 3.6.1, I can do the following
fr2v <- function() { c(5,3) }
a_b <- fr2v()
(a_b[[1]]) # prints "5"
(a_b[[2]]) # prints "3"
To obtain multiple outputs from a function and keep them in the desired format you can save the outputs to your hard disk (in the working directory) from within the function and then load them from outside the function:
myfun <- function(x) {
df1 <- ...
df2 <- ...
save(df1, file = "myfile1")
save(df2, file = "myfile2")
}
load("myfile1")
load("myfile2")
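A variation on the same idea, sketched here with saveRDS and readRDS in place of save and load, writes to a temporary directory and returns the file paths so the caller knows what to read back. The function name myfun_rds, the example data frames and the file names are all hypothetical.

```r
# Hypothetical function: write two data frames to disk and return their paths
myfun_rds <- function(x, dir = tempdir()) {
  df1 <- data.frame(a = 1:x)
  df2 <- data.frame(b = letters[1:x])
  p1 <- file.path(dir, "df1.rds")
  p2 <- file.path(dir, "df2.rds")
  saveRDS(df1, p1)
  saveRDS(df2, p2)
  invisible(c(df1 = p1, df2 = p2))
}

paths <- myfun_rds(3)
df1 <- readRDS(paths[["df1"]])   # readRDS returns the object, so you choose its name
df2 <- readRDS(paths[["df2"]])
```

Unlike load, readRDS does not restore objects under their original names, which avoids silently overwriting variables in the calling environment.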
