Append value to empty vector in R?

I'm trying to learn R and I can't figure out how to append to a list.
If this were Python I would . . .
#Python
vector = []
values = ['a','b','c','d','e','f','g']
for i in range(0,len(values)):
    vector.append(values[i])
How do you do this in R?
#R Programming
> vector = c()
> values = c('a','b','c','d','e','f','g')
> for (i in 1:length(values))
+ #append value[i] to empty vector

Appending to an object in a for loop causes the entire object to be copied on every iteration, which causes a lot of people to say "R is slow", or "R loops should be avoided".
As BrodieG mentioned in the comments: it is much better to pre-allocate a vector of the desired length, then set the element values in the loop.
Here are several ways to append values to a vector. All of them are discouraged.
Appending to a vector in a loop
# one way
for (i in 1:length(values))
  vector[i] <- values[i]
# another way
for (i in 1:length(values))
  vector <- c(vector, values[i])
# yet another way?!?
for (v in values)
  vector <- c(vector, v)
# ... more ways
help("append") would have answered your question and saved the time it took you to write this question (but would have caused you to develop bad habits). ;-)
Note that vector <- c() isn't an empty vector; it's NULL. If you want an empty character vector, use vector <- character().
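The difference between NULL and a typed empty vector can be checked directly. This is a minimal sketch; the names v_null and v_chr are only illustrative.

```r
# c() with no arguments returns NULL, not a typed empty vector
v_null <- c()
is.null(v_null)            # TRUE

# character() returns a zero-length character vector
v_chr <- character()
is.null(v_chr)             # FALSE
length(v_chr)              # 0

# Appending works from either starting point, and the results agree
v_null <- c(v_null, "a")
v_chr  <- c(v_chr, "a")
identical(v_null, v_chr)   # TRUE
```

Starting from a typed empty vector mostly matters when later code checks the type of the object before anything has been appended.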
Pre-allocate the vector before looping
If you absolutely must use a for loop, you should pre-allocate the entire vector before the loop. This will be much faster than appending for larger vectors.
set.seed(21)
values <- sample(letters, 1e4, TRUE)
vector <- character(0)
# slow
system.time( for (i in 1:length(values)) vector[i] <- values[i] )
# user system elapsed
# 0.340 0.000 0.343
vector <- character(length(values))
# fast(er)
system.time( for (i in 1:length(values)) vector[i] <- values[i] )
# user system elapsed
# 0.024 0.000 0.023
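A related detail when writing such loops: 1:length(x) misbehaves when x is empty, whereas seq_along(x) does not. The following is a minimal sketch of the pre-allocation pattern using seq_along; the names out and empty are illustrative.

```r
values <- c("a", "b", "c")

# Pre-allocate the result at its final length, then fill it in place
out <- character(length(values))
for (i in seq_along(values)) {
  out[i] <- values[i]
}

# Pitfall: with an empty input, 1:length(x) counts down from 1 to 0
empty <- character(0)
1:length(empty)     # 1 0, so a loop body would run twice
seq_along(empty)    # integer(0), so a loop body would not run at all
```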

FWIW: analogous to Python's append():
b <- 1
b <- c(b, 2)

You have a few options:
c(vector, values)
append(vector, values)
vector[(length(vector) + 1):(length(vector) + length(values))] <- values
The first one is the standard approach. The second one gives you the option to append someplace other than the end. The last one is a bit contorted but has the advantage of modifying vector (though really, you could just as easily do vector <- c(vector, values)).
Notice that in R you don't need to cycle through vectors. You can just operate on them in whole.
Also, this is fairly basic stuff, so you should go through some of the references.
Some more options based on OP feedback:
for(i in values) vector <- c(vector, i)
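As a small illustration of operating on vectors as a whole rather than element by element, here is a sketch using only base R functions. The names upper and n_chars are illustrative and not part of the original question.

```r
vector <- character()
values <- c("a", "b", "c")

# Append every value in one call; no loop is needed
vector <- c(vector, values)

# Most base functions are vectorized and work on all elements at once
upper   <- toupper(values)   # "A" "B" "C"
n_chars <- nchar(values)     # 1 1 1
```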

Just for the sake of completeness, appending values to a vector in a for loop is not really the philosophy in R. R works better by operating on vectors as a whole, as @BrodieG pointed out. See if your code can be rewritten as:
output <- sapply(values, function(v) return(2*v))
Output will be a vector of return values. You can also use lapply if values is a list instead of a vector.
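Here is a sketch of the same idea with vapply, which additionally checks the type and length of each result. It assumes values is numeric (the doubling would fail on the character values from the question); the names doubled_apply and doubled_vec are illustrative.

```r
values <- c(1, 2, 3)

# vapply is like sapply but requires each result to match the template,
# here a single numeric value
doubled_apply <- vapply(values, function(v) 2 * v, numeric(1))

# For simple arithmetic, the vectorized form gives the same result
doubled_vec <- 2 * values

identical(doubled_apply, doubled_vec)   # TRUE
```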

Sometimes we have to use loops, for example, when we don't know how many iterations we need to get the result. Take while loops as an example. Below are methods you absolutely should avoid:
a=numeric(0)
b=1
system.time(
{
while(b<=1e5){
b=b+1
a<-c(a,pi)
}
}
)
# user system elapsed
# 13.2 0.0 13.2
a=numeric(0)
b=1
system.time(
{
while(b<=1e5){
b=b+1
a<-append(a,pi)
}
}
)
# user system elapsed
# 11.06 5.72 16.84
These are very inefficient because R copies the vector every time it appends.
The most efficient way to append is to use indexing. Note that this time I let it iterate 1e7 times, but it's still much faster than c.
a=numeric(0)
system.time(
{
while(length(a)<1e7){
a[length(a)+1]=pi
}
}
)
# user system elapsed
# 5.71 0.39 6.12
This is acceptable. And we can make it a bit faster by replacing [ with [[.
a=numeric(0)
system.time(
{
while(length(a)<1e7){
a[[length(a)+1]]=pi
}
}
)
# user system elapsed
# 5.29 0.38 5.69
Maybe you already noticed that length can be time consuming. If we replace length with a counter:
a=numeric(0)
b=1
system.time(
{
while(b<=1e7){
a[[b]]=pi
b=b+1
}
}
)
# user system elapsed
# 3.35 0.41 3.76
As other users mentioned, pre-allocating the vector is very helpful. But this is a trade-off between speed and memory usage if you don't know how many loops you need to get the result.
a=rep(NaN,2*1e7)
b=1
system.time(
{
while(b<=1e7){
a[[b]]=pi
b=b+1
}
a=a[!is.na(a)]
}
)
# user system elapsed
# 1.57 0.06 1.63
An intermediate method is to gradually add blocks of results.
a=numeric(0)
b=0
step_count=0
step=1e6
system.time(
{
repeat{
a_step=rep(NaN,step)
for(i in seq_len(step)){
b=b+1
a_step[[i]]=pi
if(b>=1e7){
a_step=a_step[1:i]
break
}
}
a[(step_count*step+1):b]=a_step
if(b>=1e7) break
step_count=step_count+1
}
}
)
#user system elapsed
#1.71 0.17 1.89
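A variation on the block idea is to double the buffer whenever it fills up, so the vector is only regrown a logarithmic number of times. This is a minimal sketch, not a benchmarked recommendation; grow_append, buf and used are hypothetical names, and n stands in for a stopping condition that would normally not be known in advance.

```r
grow_append <- function(n) {
  buf  <- numeric(1)   # current storage, starting with capacity 1
  used <- 0            # number of slots actually filled
  while (used < n) {
    if (used == length(buf)) {
      # Buffer is full: double its length (new slots are padded with NA)
      length(buf) <- 2 * length(buf)
    }
    used <- used + 1
    buf[[used]] <- pi
  }
  # Drop the unused tail before returning
  buf[seq_len(used)]
}

a <- grow_append(1e4)
length(a)
```

Trimming with seq_len(used) instead of filtering on is.na keeps the result correct even if the appended values themselves could be NA.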

In R, you can try out this way:
X = NULL
X
# NULL
values = letters[1:10]
values
# [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
X = append(X,values)
X
# [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
X = append(X,letters[23:26])
X
# [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "w" "x" "y" "z"

> vec <- c(letters[1:3]) # vec <- c("a","b","c") ; or just empty vector: vec <- c()
> values<- c(1,2,3)
> for (i in 1:length(values)){
print(paste("length of vec", length(vec)));
vec[length(vec)+1] <- values[i] #Appends value at the end of vector
}
[1] "length of vec 3"
[1] "length of vec 4"
[1] "length of vec 5"
> vec
[1] "a" "b" "c" "1" "2" "3"

What you're using in the Python code is called a list in Python, and it's totally different from R vectors. If I understand what you want to do:
# you can do like this if you'll put them manually
v <- c("a", "b", "c")
# if your values are in a list
v <- unlist(your_list)
# if you just need to append
v <- append(v, value, after=length(v))

In R you create a "list" like this:
v <- numeric() (is a numeric vector, or int in Python)
v <- character() (is a character vector, or str in Python)
Then, if you want to append a single value, you have to do this:
v[1] <- 10 (append to vector "v", in position "1" a value 10)
v[2] <- 11 (append to vector "v", in position "2" a value 11)
So, if you want to append multiple values in a for loop, try this:
v <- numeric()
for (value in 1:10) {
v[value] <- value
}
v
[1] 1 2 3 4 5 6 7 8 9 10

Related

Storing the values from IF loop in a vector

I am fetching bins.txt and saving its data in "data". I tried printing it and it is printing properly.
data <- read.csv("bins.txt", header = FALSE)
for (n in 1:24060)
{
j=(data[n,])
for (i in 1:20)
{
m=(i-1)*80
n=(i*80)-1
if(m<j && j<n)
{
print (i)
}
}
}
I don't want to print(i); instead I want to store the values of i in a vector, print it outside the loop, and pass it in somewhat like this:
obs="vector"
No idea what your bins.txt is. Since I really dislike nested loops, here's a suggestion:
(i) define the twenty pairs of min (m) and max (n) values used in the condition check:
m <- lapply(1:20, function(x) (x-1)*80)
n <- lapply(1:20, function(x) (x*80)-1)
(ii) return a list of twenty vectors, one per (m, n) pair, each holding the data values that fall strictly between that pair:
lapply(1:20, function(x) dat[m[[x]] < dat & dat < n[[x]]])
Assuming that your data is
dat <- seq(0, 1000, length.out=50)
The first six vectors returned are:
[[1]]
[1] 20.40816 40.81633 61.22449
[[2]]
[1] 81.63265 102.04082 122.44898 142.85714
[[3]]
[1] 163.2653 183.6735 204.0816 224.4898
[[4]]
[1] 244.8980 265.3061 285.7143 306.1224
[[5]]
[1] 326.5306 346.9388 367.3469 387.7551
[[6]]
[1] 408.1633 428.5714 448.9796 469.3878
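If the goal is to store one bin index per data value in a vector rather than print it, a vectorized check is one option. This is a minimal sketch that assumes the same strict bounds as the question; lower, upper, hits and bin are illustrative names and the sample data is made up.

```r
dat <- c(10, 85, 200, 79.5, 1590)   # made-up sample values

i     <- 1:20
lower <- (i - 1) * 80               # m in the question
upper <- i * 80 - 1                 # n in the question

# hits[r, c] is TRUE when dat[r] lies strictly inside bin c
hits <- outer(dat, lower, ">") & outer(dat, upper, "<")

# For each value, keep the index of the matching bin, or NA if none matches
bin <- apply(hits, 1, function(row) {
  if (any(row)) which(row)[1] else NA_integer_
})
bin
```

Values that fall in the gaps between bins (such as 79.5 here) get NA, which mirrors the original loop printing nothing for them.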

find all disjoint (non-overlapping) sets from a set of sets

My problem: need to find all disjoint (non-overlapping) sets from a set of sets.
Background: I am using comparative phylogenetic methods to study trait evolution in birds. I have a tree with ~300 species. This tree can be divided into subclades (i.e. subtrees). If two subclades do not share species, they are independent. I'm looking for an algorithm (and an R implementation if possible) that will find all possible subclade partitions where each subclade has greater than 10 taxa and all are independent. Each subclade can be considered a set and when two subclades are independent (do not share species) these subclades are then disjoint sets.
Hope this is clear and someone can help.
Cheers,
Glenn
The following code produces an example dataset, where subclades is a list of all possible subclades (sets) from which I'd like to sample X disjoint sets, where the length of the set is Y.
###################################
# Example Dataset
###################################
library(ape)
library(phangorn)
library(TreeSim)
library(phytools)
##simulate a tree
n.taxa <- 300
tree <- sim.bd.taxa(n.taxa,1,lambda=.5,mu=0)[[1]][[1]]
tree$tip.label <- seq(n.taxa)
##extract all monophyletic subclades
get.all.subclades <- function(tree){
tmp <- vector("list")
nodes <- sort(unique(tree$edge[,1]))
i <- 282
for(i in 1:length(nodes)){
x <- Descendants(tree,nodes[i],type="tips")[[1]]
tmp[[i]] <- tree$tip.label[x]
}
tmp
}
tmp <- get.all.subclades(tree)
##set bounds on the maximum and minimum number of tips of the subclades to include
min.subclade.n.tip <- 10
max.subclade.n.tip <- 40
##function to replace trees of tip length exceeding max and min with NA
replace.trees <- function(x, min, max){
if(length(x) >= min & length(x)<= max) x else NA
}
#apply replace.trees across all the subclades
tmp2 <- lapply(tmp, replace.trees, min = min.subclade.n.tip, max = max.subclade.n.tip)
##remove elements from list with NA, 
##all remaining elements are subclades with number of tips between
##min.subclade.n.tip and max.subclade.n.tip
subclades <- tmp2[!is.na(tmp2)]
names(subclades) <- seq(length(subclades))
Here's an example of how you might test each pair of list elements for zero overlap, extracting the indices of all non-overlapping pairs.
findDisjointPairs <- function(X) {
## Form a 2-column matrix enumerating all pairwise combos of X's elements
ij <- t(combn(length(X),2))
## A function that tests for zero overlap between a pair of vectors
areDisjoint <- function(i, j) length(intersect(X[[i]], X[[j]])) == 0
## Use mapply to test for overlap between each pair and extract indices
## of pairs with no matches
ij[mapply(areDisjoint, ij[,1], ij[,2]),]
}
## Make some reproducible data and test the function on it
set.seed(1)
A <- replicate(sample(letters, 5), n=5, simplify=FALSE)
findDisjointPairs(A)
# [,1] [,2]
# [1,] 1 2
# [2,] 1 4
# [3,] 1 5
Here are some functions that might be useful.
The first computes all possible disjoint collections of a list of sets.
I'm using "collection" instead of "partition" because a collection does not necessarily cover the universe (i.e., the union of all sets).
The algorithm is recursive, and only works for a small number of possible collections. This does not necessarily mean that it won't work with a large list of sets, since the function removes the intersecting sets at every iteration.
If the code is not clear, please ask and I'll add comments.
The input must be a named list, and the result will be a list of collections, each of which is a character vector giving the names of the sets.
DisjointCollectionsNotContainingX <- function(L, branch=character(0), x=numeric(0))
{
filter <- vapply(L, function(y) length(intersect(x, y))==0, logical(1))
L <- L[filter]
result <- list(branch)
for( i in seq_along(L) )
{
result <- c(result, Recall(L=L[-(1:i)], branch=c(branch, names(L)[i]), x=union(x, L[[i]])))
}
result
}
This is just a wrapper to hide auxiliary arguments:
DisjointCollections <- function(L) DisjointCollectionsNotContainingX(L=L)
The next function can be used to validate a given list of collections that are supposedly non-overlapping and "maximal".
For every collection, it will test if
1. all sets are effectively disjoint and
2. adding another set either results in a non-disjoint collection or an existing collection:
ValidateDC <- function(L, DC)
{
for( collection in DC )
{
for( i in seq_along(collection) )
{
others <- Reduce(f=union, x=L[collection[-i]])
if( length(intersect(L[[collection[i]]], others)) > 0 ) return(FALSE)
}
elements <- Reduce(f=union, x=L[collection])
for( k in seq_along(L) ) if( ! (names(L)[k] %in% collection) )
{
if( length(intersect(elements, L[[k]])) == 0 )
{
check <- vapply(DC, function(z) setequal(c(collection, names(L)[k]), z), logical(1))
if( ! any(check) ) return(FALSE)
}
}
}
TRUE
}
Example:
L <- list(A=c(1,2,3), B=c(3,4), C=c(5,6), D=c(6,7,8))
> ValidateDC(L,DisjointCollections(L))
[1] TRUE
> DisjointCollections(L)
[[1]]
character(0)
[[2]]
[1] "A"
[[3]]
[1] "A" "C"
[[4]]
[1] "A" "D"
[[5]]
[1] "B"
[[6]]
[1] "B" "C"
[[7]]
[1] "B" "D"
[[8]]
[1] "C"
[[9]]
[1] "D"
Note that the collections containing A and B simultaneously do not show up, due to their non-null intersection. Also, collections with C and D simultaneously don't appear. Others are OK.
Note: the empty collection character(0) is always a valid combination.
After creating all possible disjoint collections, you can apply any filters you want to proceed.
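As an illustration of such filters, the sketch below works on the nine collections from the example output, typed in by hand so that it runs on its own. The names pairs_only, is_maximal and maximal are illustrative, and "maximal" here means that no other collection strictly contains it.

```r
# The collections from the example output above, entered literally
DC <- list(character(0), "A", c("A", "C"), c("A", "D"),
           "B", c("B", "C"), c("B", "D"), "C", "D")

# Filter 1: keep only collections made of exactly two sets
pairs_only <- DC[lengths(DC) == 2]

# Filter 2: keep collections that are not contained in a larger collection
is_maximal <- vapply(seq_along(DC), function(i) {
  contained <- vapply(DC[-i], function(other) {
    length(other) > length(DC[[i]]) && all(DC[[i]] %in% other)
  }, logical(1))
  !any(contained)
}, logical(1))
maximal <- DC[is_maximal]

length(maximal)
```

For this small example the two filters select the same four collections, but on real data they would generally differ.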
EDIT:
I've removed the line if( length(L)==0 ) return(list(branch)) from the first function; it's not needed.
Performance: If there is considerable overlapping among sets, the function runs fast. Example:
set.seed(1)
L <- lapply(1:50, function(.)sample(x=1200, size=20))
names(L) <- c(LETTERS, letters)[1:50]
system.time(DC <- DisjointCollections(L))
Result:
# user system elapsed
# 9.91 0.00 9.92
Total number of collections found:
> length(DC)
[1] 121791

Convenience function for # elements in data.frame, matrix, vector?

Is there a built-in convenience function that returns the number of elements in a data.frame, matrix, or vector? length( matrix ) and length( vector ) work, but length( data.frame ) returns the number of columns. prod( dim( vector ) ) returns 1 always, but works fine with matrix/data.frame. I'm looking for a single function that works for all three.
I don't think one already exists, so just write your own. You should only need 2 cases, 1) lists, 2) arrays:
elements <- function(x) {
if(is.list(x)) {
do.call(sum,lapply(x, elements))
} else {
length(x)
}
}
d <- data.frame(1:10, letters[1:10])
m <- as.matrix(d)
v <- d[,1]
l <- c(d, list(1:5))
L <- list(l, list(1:10))
elements(d) # data.frame
# [1] 20
elements(m) # matrix
# [1] 20
elements(v) # vector
# [1] 10
elements(l) # list
# [1] 25
elements(L) # list of lists
# [1] 35
What about length(unlist(whatever))?
(Note: I just wanted to reply that there's no such function, but suddenly I recalled I just used unlist 30 minutes ago, and that it can be applied to get easy solution! What a coincidence...)
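A minimal sketch of the unlist approach wrapped in a helper is shown below; count_elements is a hypothetical name, not an existing base function.

```r
# Flatten whatever is passed in and count the resulting elements
count_elements <- function(x) length(unlist(x, use.names = FALSE))

d <- data.frame(a = 1:10, b = letters[1:10])

count_elements(d)                        # data.frame
count_elements(as.matrix(d))             # matrix
count_elements(1:10)                     # vector
count_elements(list(1:3, list(4:5)))     # nested list
```

Because unlist flattens recursively, nested lists are counted by their leaf elements, matching the recursive elements() function shown earlier in this thread.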
My personal 'convenience function' for this is:
Rgames: lssize
function(items){
sizes<-sapply(sapply(sapply(sapply(items,get,simplify=F),unlist,simplify=F),as.vector,simplify=F),length)
return(sizes)
}
It works on every 'typeof' variable I could think of. FWIW, it's part of my toolkit which includes the useful "find only one type of variable in my workspace" :
Rgames: lstype
function(type='closure'){
inlist<-ls(.GlobalEnv)
if (type=='function') type <-'closure'
typelist<-sapply(sapply(inlist,get),typeof)
return(names(typelist[typelist==type]))
}

compare adjacent elements of the same vector (avoiding loops)

I managed to write a for loop to compare letters in the following vector:
bases <- c("G","C","A","T")
test <- sample(bases, replace=T, 20)
test will return
[1] "T" "G" "T" "G" "C" "A" "A" "G" "A" "C" "A" "T" "T" "T" "T" "C" "A" "G" "G" "C"
with the function Comp() I can check if a letter is matching to the next letter
Comp <- function(data)
{
output <- vector()
for(i in 1:(length(data)-1))
{
if(data[i]==data[i+1])
{
output[i] <-1
}
else
{
output[i] <-0
}
}
return(output)
}
Resulting in;
> Comp(test)
[1] 0 0 0 0 0 1 0 0 0 0 0 1 1 1 0 0 0 1 0
This is working, however it's very slow with large numbers. Therefore I tried lapply():
Comp <- function(x,i) if(x[i]==x[i+1]) 1 else 0
unlist(lapply(test, Comp, test))
Unfortunately it's not working... (Error in i + 1 : non-numeric argument to binary operator) I have trouble figuring out how to access the preceding letter in the vector to compare it. Also the length(data)-1, to "not compare" the last letter, might become a problem.
Thank you all for the help!
Cheers
Lucky
Just "lag" test and use ==, which is vectorized.
bases <- c("G","C","A","T")
set.seed(21)
test <- sample(bases, replace=TRUE, 20)
lag.test <- c(tail(test,-1),NA)
#lag.test <- c(NA,head(test,-1))
test == lag.test
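To get the same 0/1 vector of length n-1 that Comp() returns, the shifted copies can be built with head and tail. This is a minimal sketch using the sample vector printed in the question; same_as_next is an illustrative name.

```r
test <- c("T", "G", "T", "G", "C", "A", "A", "G", "A", "C",
          "A", "T", "T", "T", "T", "C", "A", "G", "G", "C")

# Compare elements 1..n-1 with elements 2..n in a single vectorized call,
# then convert TRUE/FALSE to 1/0
same_as_next <- as.integer(head(test, -1) == tail(test, -1))
same_as_next
```

Dropping the last element instead of padding with NA avoids a trailing NA in the result.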
Update:
Also, your Comp function is slow because you don't specify the length of output when you initialize it. I suspect you were trying to pre-allocate, but vector() creates a zero-length vector that must be expanded during every iteration of your loop. Your Comp function is significantly faster if you change the call to vector() to vector(length=NROW(data)-1).
set.seed(21)
test <- sample(bases, replace=T, 1e5)
system.time(orig <- Comp(test))
# user system elapsed
# 34.760 0.010 34.884
system.time(prealloc <- Comp.prealloc(test))
# user system elapsed
# 1.18 0.00 1.19
identical(orig, prealloc)
# [1] TRUE
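The timings above call Comp.prealloc, which is not shown. One plausible version, following the change described (replacing vector() with vector(length=NROW(data)-1)), might look like the sketch below; the exact function used for the timings is not given in this thread.

```r
Comp.prealloc <- function(data) {
  # Allocate the result at its final length up front
  output <- vector(length = NROW(data) - 1)
  for (i in seq_along(output)) {
    if (data[i] == data[i + 1]) {
      output[i] <- 1
    } else {
      output[i] <- 0
    }
  }
  output
}

Comp.prealloc(c("T", "G", "G", "C"))
```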
As @Joshua wrote, you should of course use vectorization - it is way more efficient.
...But just for reference, your Comp function can still be optimized a bit.
The result of a comparison is TRUE/FALSE, which are glorified versions of 1/0. Also, making the result integer instead of numeric uses half the memory.
Comp.opt <- function(data)
{
output <- integer(length(data)-1L)
for(i in seq_along(output))
{
output[[i]] <- (data[[i]]==data[[i+1L]])
}
return(output)
}
...and the speed difference:
> system.time(orig <- Comp(test))
user system elapsed
21.10 0.00 21.11
> system.time(prealloc <- Comp.prealloc(test))
user system elapsed
0.49 0.00 0.49
> system.time(opt <- Comp.opt(test))
user system elapsed
0.41 0.00 0.40
> all.equal(opt, orig) # opt is integer, orig is double
[1] TRUE
Have a look at this :
> x = c("T", "G", "T", "G", "G","T","T","T")
>
> res = sequence(rle(x)$lengths)-1
>
> dt = data.frame(x,res)
>
> dt
x res
1 T 0
2 G 0
3 T 0
4 G 0
5 G 1
6 T 0
7 T 1
8 T 2
Might work faster.

How to assign from a function which returns more than one value?

Still trying to get into the R logic... what is the "best" way to unpack (on LHS) the results from a function returning multiple values?
I can't do this apparently:
R> functionReturningTwoValues <- function() { return(c(1, 2)) }
R> functionReturningTwoValues()
[1] 1 2
R> a, b <- functionReturningTwoValues()
Error: unexpected ',' in "a,"
R> c(a, b) <- functionReturningTwoValues()
Error in c(a, b) <- functionReturningTwoValues() : object 'a' not found
must I really do the following?
R> r <- functionReturningTwoValues()
R> a <- r[1]; b <- r[2]
or would the R programmer write something more like this:
R> functionReturningTwoValues <- function() {return(list(first=1, second=2))}
R> r <- functionReturningTwoValues()
R> r$first
[1] 1
R> r$second
[1] 2
--- edited to answer Shane's questions ---
I don't really need to give names to the parts of the result. I am applying one aggregate function to the first component and another to the second component (min and max; if it were the same function for both components I would not need to split them).
(1) list[...]<- I had posted this over a decade ago on r-help. Since then it has been added to the gsubfn package. It does not require a special operator but does require that the left hand side be written using list[...] like this:
library(gsubfn) # need 0.7-0 or later
list[a, b] <- functionReturningTwoValues()
If you only need the first or second component these all work too:
list[a] <- functionReturningTwoValues()
list[a, ] <- functionReturningTwoValues()
list[, b] <- functionReturningTwoValues()
(Of course, if you only needed one value then functionReturningTwoValues()[[1]] or functionReturningTwoValues()[[2]] would be sufficient.)
See the cited r-help thread for more examples.
(2) with If the intent is merely to combine the multiple values subsequently and the return values are named then a simple alternative is to use with :
myfun <- function() list(a = 1, b = 2)
list[a, b] <- myfun()
a + b
# same
with(myfun(), a + b)
(3) attach Another alternative is attach:
attach(myfun())
a + b
ADDED: with and attach
I somehow stumbled on this clever hack on the internet ... I'm not sure if it's nasty or beautiful, but it lets you create a "magical" operator that allows you to unpack multiple return values into their own variable. The := function is defined here, and included below for posterity:
':=' <- function(lhs, rhs) {
frame <- parent.frame()
lhs <- as.list(substitute(lhs))
if (length(lhs) > 1)
lhs <- lhs[-1]
if (length(lhs) == 1) {
do.call(`=`, list(lhs[[1]], rhs), envir=frame)
return(invisible(NULL))
}
if (is.function(rhs) || is(rhs, 'formula'))
rhs <- list(rhs)
if (length(lhs) > length(rhs))
rhs <- c(rhs, rep(list(NULL), length(lhs) - length(rhs)))
for (i in 1:length(lhs))
do.call(`=`, list(lhs[[i]], rhs[[i]]), envir=frame)
return(invisible(NULL))
}
With that in hand, you can do what you're after:
functionReturningTwoValues <- function() {
return(list(1, matrix(0, 2, 2)))
}
c(a, b) := functionReturningTwoValues()
a
#[1] 1
b
# [,1] [,2]
# [1,] 0 0
# [2,] 0 0
I don't know how I feel about that. Perhaps you might find it helpful in your interactive workspace. Using it to build (re-)usable libraries (for mass consumption) might not be the best idea, but I guess that's up to you.
... you know what they say about responsibility and power ...
Usually I wrap the output into a list, which is very flexible (you can have any combination of numbers, strings, vectors, matrices, arrays, lists, objects in the output)
so like:
func2<-function(input) {
a<-input+1
b<-input+2
output<-list(a,b)
return(output)
}
output<-func2(5)
for (i in output) {
print(i)
}
[1] 6
[1] 7
I put together an R package zeallot to tackle this problem. zeallot includes a multiple assignment or unpacking assignment operator, %<-%. The LHS of the operator is any number of variables to assign, built using calls to c(). The RHS of the operator is a vector, list, data frame, date object, or any custom object with an implemented destructure method (see ?zeallot::destructure).
Here are a handful of examples based on the original post,
library(zeallot)
functionReturningTwoValues <- function() {
return(c(1, 2))
}
c(a, b) %<-% functionReturningTwoValues()
a # 1
b # 2
functionReturningListOfValues <- function() {
return(list(1, 2, 3))
}
c(d, e, f) %<-% functionReturningListOfValues()
d # 1
e # 2
f # 3
functionReturningNestedList <- function() {
return(list(1, list(2, 3)))
}
c(f, c(g, h)) %<-% functionReturningNestedList()
f # 1
g # 2
h # 3
functionReturningTooManyValues <- function() {
return(as.list(1:20))
}
c(i, j, ...rest) %<-% functionReturningTooManyValues()
i # 1
j # 2
rest # list(3, 4, 5, ..)
Check out the package vignette for more information and examples.
functionReturningTwoValues <- function() {
results <- list()
results$first <- 1
results$second <-2
return(results)
}
a <- functionReturningTwoValues()
I think this works.
There's no right answer to this question. It really depends on what you're doing with the data. In the simple example above, I would strongly suggest:
Keep things as simple as possible.
Wherever possible, it's a best practice to keep your functions vectorized. That provides the greatest amount of flexibility and speed in the long run.
Is it important that the values 1 and 2 above have names? In other words, why is it important in this example that 1 and 2 be named a and b, rather than just r[1] and r[2]? One important thing to understand in this context is that a and b are also both vectors of length 1. So you're not really changing anything in the process of making that assignment, other than having 2 new vectors that don't need subscripts to be referenced:
> r <- c(1,2)
> a <- r[1]
> b <- r[2]
> class(r)
[1] "numeric"
> class(a)
[1] "numeric"
> a
[1] 1
> a[1]
[1] 1
You can also assign the names to the original vector if you would rather reference the letter than the index:
> names(r) <- c("a","b")
> names(r)
[1] "a" "b"
> r["a"]
a
1
[Edit] Given that you will be applying min and max to each vector separately, I would suggest either using a matrix (if a and b will be the same length and the same data type) or data frame (if a and b will be the same length but can be different data types) or else use a list like in your last example (if they can be of differing lengths and data types).
> r <- data.frame(a=1:4, b=5:8)
> r
a b
1 1 5
2 2 6
3 3 7
4 4 8
> min(r$a)
[1] 1
> max(r$b)
[1] 8
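When the two components can differ in length, the same min/max pattern works with a named list. This is a minimal sketch; two_components and the element names first and second are hypothetical, chosen to echo the question.

```r
# A function returning two components of different lengths in a named list
two_components <- function() {
  list(first = c(3, 1, 2), second = c(9, 7, 8, 6))
}

r <- two_components()

# Apply a different aggregate to each component
lo <- min(r$first)
hi <- max(r$second)

# The same thing without intermediate variables, using with()
with(two_components(), c(lo = min(first), hi = max(second)))
```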
If you want to return the output of your function to the Global Environment, you can use list2env, like in this example:
myfun <- function(x) { a <- 1:x
b <- 5:x
df <- data.frame(a=a, b=b)
newList <- list("my_obj1" = a, "my_obj2" = b, "myDF"=df)
list2env(newList ,.GlobalEnv)
}
myfun(3)
This function will create three objects in your Global Environment:
> my_obj1
[1] 1 2 3
> my_obj2
[1] 5 4 3
> myDF
a b
1 1 5
2 2 4
3 3 3
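Writing into the global environment can silently overwrite existing objects with the same names. As a possible alternative, list2env can target a dedicated environment instead; the sketch below assumes that is acceptable for your use, and the names vals and e are illustrative.

```r
vals <- list(a = 1, b = 2)

# Unpack the list into a fresh environment rather than .GlobalEnv
e <- new.env()
list2env(vals, envir = e)

e$a + e$b          # 3
sort(ls(e))        # "a" "b"
```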
Lists seem perfect for this purpose. For example within the function you would have
x = desired_return_value_1 # (vector, matrix, etc)
y = desired_return_value_2 # (vector, matrix, etc)
returnlist = list(x,y...)
} # end of function
main program
x = returnlist[[1]]
y = returnlist[[2]]
Yes to your second and third questions -- that's what you need to do as you cannot have multiple 'lvalues' on the left of an assignment.
How about using assign?
functionReturningTwoValues <- function(a, b) {
assign(a, 1, pos=1)
assign(b, 2, pos=1)
}
You can pass the names of the variable you want to be passed by reference.
> functionReturningTwoValues('a', 'b')
> a
[1] 1
> b
[1] 2
If you need to access the existing values, the converse of assign is get.
[A]
If each of foo and bar is a single number, then there's nothing wrong with c(foo,bar); and you can also name the components: c(Foo=foo,Bar=bar). So you could access the components of the result 'res' as res[1], res[2]; or, in the named case, as res["Foo"], res["Bar"].
[B]
If foo and bar are vectors of the same type and length, then again there's nothing wrong with returning cbind(foo,bar) or rbind(foo,bar); likewise nameable. In the 'cbind' case, you would access foo and bar as res[,1], res[,2] or as res[,"Foo"], res[,"Bar"]. You might also prefer to return a dataframe rather than a matrix:
data.frame(Foo=foo,Bar=bar)
and access them as res$Foo, res$Bar. This would also work well if foo and bar were of the same length but not of the same type (e.g. foo is a vector of numbers, bar a vector of character strings).
[C]
If foo and bar are sufficiently different not to combine conveniently as above, then you should definitely return a list.
For example, your function might fit a linear model and
also calculate predicted values, so you could have
LM<-lm(....) ; foo<-summary(LM); bar<-LM$fit
and then you would return list(Foo=foo,Bar=bar) and then access the summary as res$Foo, the predicted values as res$Bar
source: http://r.789695.n4.nabble.com/How-to-return-multiple-values-in-a-function-td858528.html
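Case [C] can be sketched with the built-in cars dataset. The function name fit_and_predict and the model formula are assumptions made for illustration; only the list(Foo=..., Bar=...) return shape comes from the text above.

```r
fit_and_predict <- function(data) {
  # Fit a simple linear model on the built-in 'cars' columns
  LM <- lm(dist ~ speed, data = data)
  # Return two quite different objects together in a named list
  list(Foo = summary(LM), Bar = fitted(LM))
}

res <- fit_and_predict(cars)
class(res$Foo)       # the model summary
head(res$Bar)        # the fitted (predicted) values
```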
Year 2021 and this is something I frequently use.
tidyverse package has a function called lst that assigns name to the list elements when creating the list.
After that, I use list2env() to assign the variables, or use the list directly
library(tidyverse)
fun <- function(){
a<-1
b<-2
lst(a,b)
}
list2env(fun(), envir=.GlobalEnv) # unpacks the list's names and values into variables in the global environment
This is only for the sake of completeness and not because I personally prefer it. You can pipe %>% the result, evaluate it with curly braces {} and write variables to the parent environment using double-arrow <<-.
library(tidyverse)
functionReturningTwoValues() %>% {a <<- .[1]; b <<- .[2]}
UPDATE:
You can also use the multiple assignment operator from the zeallot package: %<-%
c(a, b) %<-% list(0, 1)
I will post a function that returns multiple objects by way of vectors:
Median <- function(X){
X_Sort <- sort(X)
if (length(X)%%2==0){
Median <- (X_Sort[(length(X)/2)]+X_Sort[(length(X)/2)+1])/2
} else{
Median <- X_Sort[(length(X)+1)/2]
}
return(Median)
}
That was a function I created to calculate the median. I know that there's a built-in function in R called median(), but I programmed it anyway in order to build another function that calculates the quartiles of a numeric data set using the Median() function I just programmed. The Median() function works like this:
If a numeric vector X has an even number of elements (i.e., length(X)%%2==0), the median is calculated by averaging the elements sort(X)[length(X)/2] and sort(X)[(length(X)/2+1)].
If X doesn't have an even number of elements, the median is sort(X)[(length(X)+1)/2].
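The two rules above can be checked against the built-in median(). This is a compact sketch of the same logic under a different, hypothetical name (median_sketch) so it can run on its own.

```r
median_sketch <- function(x) {
  s <- sort(x)
  n <- length(s)
  if (n %% 2 == 0) {
    # Even length: average the two middle elements
    (s[n / 2] + s[n / 2 + 1]) / 2
  } else {
    # Odd length: take the single middle element
    s[(n + 1) / 2]
  }
}

median_sketch(c(5, 1, 3))       # odd length
median_sketch(c(4, 1, 3, 2))    # even length
```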
On to the QuartilesFunction():
QuartilesFunction <- function(X){
X_Sort <- sort(X) # Data is sorted in ascending order
if (length(X)%%2==0){
# Data number is even
HalfDN <- X_Sort[1:(length(X)/2)]
HalfUP <- X_Sort[((length(X)/2)+1):length(X)]
QL <- Median(HalfDN)
QU <- Median(HalfUP)
QL1 <- QL
QL2 <- QL
QU1 <- QU
QU2 <- QU
QL3 <- QL
QU3 <- QU
Quartiles <- c(QL1,QU1,QL2,QU2,QL3,QU3)
names(Quartiles) = c("QL (1)", "QU (1)", "QL (2)", "QU (2)","QL (3)", "QU (3)")
} else{ # Data number is odd
# Including the median
Half1DN <- X_Sort[1:((length(X)+1)/2)]
Half1UP <- X_Sort[(((length(X)+1)/2)):length(X)]
QL1 <- Median(Half1DN)
QU1 <- Median(Half1UP)
# Not including the median
Half2DN <- X_Sort[1:(((length(X)+1)/2)-1)]
Half2UP <- X_Sort[(((length(X)+1)/2)+1):length(X)]
QL2 <- Median(Half2DN)
QU2 <- Median(Half2UP)
# Methods (1) and (2) averaged
QL3 <- (QL1+QL2)/2
QU3 <- (QU1+QU2)/2
Quartiles <- c(QL1,QU1,QL2,QU2,QL3,QU3)
names(Quartiles) = c("QL (1)", "QU (1)", "QL (2)", "QU (2)","QL (3)", "QU (3)")
}
return(Quartiles)
}
This function returns the quartiles of a numeric vector by using three methods:
Keeping the median for the calculation of the quartiles when the number of elements of the numeric vector X is odd.
Discarding the median for the calculation of the quartiles when the number of elements of the numeric vector X is odd.
Averaging the results obtained by using methods 1 and 2.
When the number of elements in the numeric vector X is even, the three methods coincide.
The result of the QuartilesFunction() is a vector that depicts the first and third quartiles calculated by using the three methods outlined.
With R 3.6.1, I can do the following
fr2v <- function() { c(5,3) }
a_b <- fr2v()
(a_b[[1]]) # prints "5"
(a_b[[2]]) # prints "3"
To obtain multiple outputs from a function and keep them in the desired format you can save the outputs to your hard disk (in the working directory) from within the function and then load them from outside the function:
myfun <- function(x) {
df1 <- ...
df2 <- ...
save(df1, file = "myfile1")
save(df2, file = "myfile2")
}
load("myfile1")
load("myfile2")
