How to store different outputs inside a function? [duplicate] - r

I want to store different output variables that are calculated inside a function.
I coded a toy example:
f = function(number)
{
xx = NULL
savexx = NULL
savexx10 = NULL
for (i in 1:10) {
x = number*i
xx = c(xx,x)
}
save_phrase = "hello"
savexx = xx
savexx10 = xx*10
save = cbind(savexx,savexx10)
}
store = f(1)
store
But with this code it is returning only the variable save = cbind(savexx,savexx10).
I would like to save all the 4 variables that are created inside this function.
Is it possible to do this without using a data frame or a list?

A function returns a single object, so the straightforward way is to collect the values in a list. A list is better than a data.frame here because it can store different types of objects (vector, table, plot, etc.). Try it like this:
f = function(number)
{
xx = NULL
savexx = NULL
savexx10 = NULL
for (i in 1:10) {
x = number*i
xx = c(xx,x)
}
lista <- list()
lista$save_phrase = "hello"
lista$savexx = xx
lista$savexx10 = xx*10
lista$save = cbind(lista$savexx, lista$savexx10)
lista
}
store = f(1)
# whole list:
store
# elements of a list:
store$save_phrase
store$savexx
store$savexx10
store$save

1) list We can return the desired variables in a list.
f2 = function(number) {
xx = NULL
savexx = NULL
savexx10 = NULL
for (i in 1:10) {
x = number*i
xx = c(xx,x)
}
# build save from xx directly: the local savexx and savexx10 are still NULL here
list(save_phrase = "hello",
savexx = xx,
savexx10 = xx*10,
save = cbind(savexx = xx, savexx10 = xx*10))
}
store = f2(1)
2) mget Another way to do this is to use mget if the returned variables have a pattern to their names as in this case:
f3 = function(number) {
xx = NULL
savexx = NULL
savexx10 = NULL
for (i in 1:10) {
x = number*i
xx = c(xx,x)
}
save_phrase = "hello"
savexx = xx
savexx10 = xx*10
save = cbind(savexx,savexx10)
mget(ls(pattern = "save"))
}
store = f3(1)
3) gsubfn gsubfn has a facility for placing the list components into separate variables. After this is run, save_phrase, savexx, savexx10 and save will exist as separate variables.
library(gsubfn)
list[save_phrase, savexx, savexx10, save] <- f2(1)
4) attach Although this is not really recommended you can do this:
attach(f2(1), name = "f2")
This will create an entry on the search list with the variables that were returned so we can just refer to save_phrase, savexx, savexx10 and save. We can see the entry using search() and ls("f2") and we can remove the entry using detach("f2").
5) list2env Another possibility, which is not really recommended but does work, is to assign the components directly into a specific environment. After this is run, save_phrase, savexx, savexx10 and save will all exist in the global environment.
list2env(f2(1), .GlobalEnv)
Similarly this will inject those variables into the current environment. This is the same as the prior line if the current environment is the global environment.
list2env(f2(1), environment())
6) Again, I am not so sure this is a good idea but we could modify f to inject the outputs right into the parent frame. After this is run save_phrase, savexx, savexx10 and save will all exist in the current environment.
f4 = function(number, env = parent.frame()) {
xx = NULL
savexx = NULL
savexx10 = NULL
for (i in 1:10) {
x = number*i
xx = c(xx,x)
}
env$save_phrase = "hello"
env$savexx = xx
env$savexx10 = xx*10
env$save = cbind(savexx = xx, savexx10 = xx*10) # local savexx/savexx10 are NULL, so use xx
invisible(env)
}
f4(1)
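As a hedged variation on approaches 5) and 6): list2env() can also copy the components into a fresh environment, which keeps them out of the workspace. In the sketch below, make_results() and results_env are hypothetical names used only for illustration; make_results() stands in for f2().

```r
# Hypothetical helper standing in for f2(): returns a named list.
make_results <- function(number) {
  xx <- number * 1:10
  list(save_phrase = "hello", savexx = xx, savexx10 = xx * 10)
}

# Copy the list components into a fresh environment, not the workspace.
results_env <- list2env(make_results(1), envir = new.env())

ls(results_env)              # the three component names
results_env$savexx10[1:3]    # 10 20 30
```

The variables are then reached through results_env$... and disappear when that environment is no longer referenced.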

R functions only return a SINGLE object. If you want multiple objects returned they have to be combined into a list or some other type of object.
Some languages like Python let us do stuff like this:
a, b = mult_return_func()
But R will only return a single object. R programmers typically use lists to return multiple objects.
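A minimal sketch of the list idiom, assuming a made-up function name (summarise_multiples) and a vectorised calculation in place of the loop from the question:

```r
# Hypothetical example function: returns several results bundled in one list.
summarise_multiples <- function(number, n = 10) {
  xx <- number * seq_len(n)          # vectorised replacement for the loop
  list(
    save_phrase = "hello",
    savexx      = xx,
    savexx10    = xx * 10,
    save        = cbind(savexx = xx, savexx10 = xx * 10)
  )
}

store <- summarise_multiples(1)

# Pull out individual components by name.
store$save_phrase        # "hello"
store[["savexx"]][1:3]   # 1 2 3
dim(store$save)          # 10 2

# Evaluate an expression with the components visible as variables.
with(store, sum(savexx10))   # 550
```

with() is one way to work with the components by name without copying them into the workspace.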

If there is no return statement, then R will return the value of the last evaluated expression in the function.
This would explain why it is returning save = cbind(savexx,savexx10).
To return multiple values you will need a list or another object because the R return function can only return a single object.
My suggestion would be to add those values to a list, return the list, and then get the variables from the list.
I hope that helps. If you'd like to read more then I suggest going to https://www.datamentor.io/r-programming/return-function/
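To illustrate the point about the last evaluated expression, here is a small sketch with two made-up functions (last_value and ends_with_assignment are names invented for this example):

```r
# No explicit return(): the value of the last expression is returned.
last_value <- function(x) {
  y <- x + 1
  y * 2
}
last_value(1)   # 4

# If the last expression is an assignment, the value is still returned,
# but invisibly, so it is not auto-printed at the console.
ends_with_assignment <- function(x) {
  result <- x * 10
}
ends_with_assignment(2)                        # prints nothing
stored <- ends_with_assignment(2)              # stored is 20
withVisible(ends_with_assignment(2))$visible   # FALSE
```

This is the situation in the question: the function ends with save = cbind(...), so that single value is what comes back.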

Related

R : How to create objects with a function which name and value depend on an argument, and that these objects are found in the global environment?

I have the following situation: I have different dataframes, and for each dataframe I would like to create 2 dataframes according to the value of one of the columns (log2FoldChange > 1 and log2FoldChange < -1).
For this I use the following code:
DJ29_T0_Overexpr = DJ29_T0[which(DJ29_T0$log2FoldChange > 1),]
DJ29_T0_Underexpr = DJ29_T0[which(DJ21_T0$log2FoldChange < -1),]
DJ29_T0 being one of my dataframes.
First problem: the sign for the dataframe where log2FoldChange < -1 is not taken into account.
But the main problem is at the time of making the function, I wrote the following:
spliteOverUnder <- function(res){
nm <-deparse(substitute(res))
assign(paste(nm,"_Overexpr", sep=""), res[which(as.numeric(as.character(res$log2FoldChange)) > 1),])
assign(paste(nm,"_Underexpr", sep=""), res[which(as.numeric(as.character(res$log2FoldChange)) < -1),])
}
Which I then ran with :
spliteOverUnder(DJ29_T0)
No error message, but my objects are not exported to my global environment. I tried return(paste(nm,"_Overexpr", sep="")), but it only returns the object name, not the associated dataframe.
Using paste() forces the use of assign(), so I can't do :
spliteOverUnder <- function(res){
nm <-deparse(substitute(res))
paste(nm,"_Overexpr", sep="") <<- res[which(as.numeric(as.character(res$log2FoldChange)) > 1),]
paste(nm,"_Underexpr", sep="") <<- res[which(as.numeric(as.character(res$log2FoldChange)) < -1),]
}
spliteOverUnder(DJ24_T0)
I encounter the following error:
Error in paste(nm, "_Overexpr", sep = "") <<- res[which(as.numeric(as.character(res$log2FoldChange)) > :
could not find function "paste<-"
If you've encountered this difficulty before, I'd appreciate a little help.
And if you know, once the function works, how to use a for loop over a list containing all my dataframes to apply this function to each of them, I would welcome that too.
Thanks
When assigning, use the pos argument to hoist the new objects out of the function.
function(){
assign(x = ..., value = ...,
pos = 1 ## see below
)
}
... where pos = -1 (the default) is the environment assign() is called from, here the function's local environment, and pos = 1 is the first entry on the search path, i.e. the global environment.
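A small sketch of where assign() puts an object: by default it writes into the frame of the function that calls it, and an explicit envir argument sends it elsewhere. The environment target and both function names are made up for this example; target plays the role that .GlobalEnv would play in the question.

```r
# Target environment used for illustration (a stand-in for .GlobalEnv).
target <- new.env()

assign_locally <- function(name, value) {
  assign(name, value)            # default: the function's own frame
  invisible(NULL)
}

assign_elsewhere <- function(name, value, envir) {
  assign(name, value, envir = envir)   # explicit destination
  invisible(NULL)
}

assign_locally("a_Overexpr", 1:3)
assign_elsewhere("b_Overexpr", 4:6, envir = target)

exists("a_Overexpr", envir = target, inherits = FALSE)   # FALSE
exists("b_Overexpr", envir = target, inherits = FALSE)   # TRUE
get("b_Overexpr", envir = target)                        # 4 5 6
```

The first call mirrors the original spliteOverUnder(): the object is created, but only inside the function, and it is gone once the function returns.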
edit
A general function to create the split dataframes in your global environment follows. However, you might rather want to save the new dataframes (from within the function) or just forward them to downstream functions than cram your workspace with intermediary objects.
splitOverUnder <- function(the_name_of_the_frame){
df <- get(the_name_of_the_frame)
df$cat <- cut(df$log2FoldChange,
breaks = c(-Inf, -1, 1, Inf),
labels = c('underexpr', 'normal', 'overexpr')
)
split_data <- split(df, df$cat)
sapply(c('underexpr', 'overexpr'),
function(n){
new_df_name <- paste(the_name_of_the_frame, n, sep = '_')
assign(x = new_df_name,
value = split_data[[n]], # [[n]] uses the value of n; $n would look for an element literally named "n"
envir = .GlobalEnv
)
}
)
}
## say, df1 and df2 are your initial dataframes to split:
sapply(c('df1', 'df2'), function(n) splitOverUnder(n))
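As suggested above, the split data frames can also be returned instead of being assigned into the workspace. The following is only a sketch: split_over_under(), the threshold argument, the data frame DJ29_T1 and all values are made up for illustration; only the column name log2FoldChange comes from the question.

```r
# Hypothetical helper: split one data frame on log2FoldChange and return
# both parts in a named list rather than assigning them globally.
split_over_under <- function(df, threshold = 1) {
  lfc <- df$log2FoldChange
  list(
    Overexpr  = df[which(lfc >  threshold), , drop = FALSE],
    Underexpr = df[which(lfc < -threshold), , drop = FALSE]
  )
}

# Made-up example data; the column name follows the question.
frames <- list(
  DJ29_T0 = data.frame(gene = c("g1", "g2", "g3"),
                       log2FoldChange = c(2.5, -1.7, 0.3),
                       stringsAsFactors = FALSE),
  DJ29_T1 = data.frame(gene = c("g4", "g5"),
                       log2FoldChange = c(-3.0, 1.2),
                       stringsAsFactors = FALSE)
)

# One call handles every data frame; results stay grouped by name.
split_frames <- lapply(frames, split_over_under)

split_frames$DJ29_T0$Overexpr$gene    # "g1"
split_frames$DJ29_T0$Underexpr$gene   # "g2"
split_frames$DJ29_T1$Underexpr$gene   # "g4"
```

Keeping the input data frames in one named list also answers the follow-up about looping: lapply() replaces the for loop.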

identical() but for environments/R6 in base R?

If I can run code before and after a user runs some code, how can I detect which variables were set or changed using base R? I can do this using identical() for non-environment objects. But is there a base-R solution for environments, including R6 classes?
Here's a solution using identical() which fails for envs/R6:
# Copy of initial vars
this_frame = sys.frame()
start_vars = ls()
start_copy = lapply(start_vars, get, envir = this_frame )
names(start_copy) = start_vars
# (user code here)
# Assess what's new and what's changed
end_vars = ls()
new_vars = end_vars[end_vars %in% start_vars == FALSE]
old_vars = end_vars[end_vars %in% start_vars == TRUE]
changed_vars = old_vars[sapply(old_vars, function(x) identical(get(x, envir = this_frame), start_copy[[x]])) == FALSE]
I'm writing a package that lets users run code in a separate session. I'd like to return only objects that were changed.
This solution detects changes in an environment, sub-environments, and R6-classes.
General approach
1. Run start_state = env_as_list() on sys.frame(), which stores everything in a list and recursively converts all environments/R6 objects and sub-environments to lists.
2. Let the user manipulate stuff.
3. Run end_state = env_as_list() and use identical() to detect changes between start_state and end_state.
env_as_list = function(env) {
rapply(
object = as.list(env, all.names = TRUE),
f = function(x) {
if ("R6" %in% class(x)) {
# R6 to list without recursion
x = as.list(x, all.names = TRUE)
x$.__enclos_env__$self = NULL
x$.__enclos_env__$super = NULL
env_as_list(x)
} else if (is.environment(x)) {
env_as_list(x)
} else {
stop("Impossible to get here")
}
},
classes = c("environment", "R6"),
how = "replace"
)
}
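The reason for converting environments to lists can be seen in a small base-R sketch (the names e1, e2, before_ref and before_list are made up): identical() compares environments by reference, so a saved reference never reveals a change, while a list snapshot does.

```r
e1 <- new.env()
e1$value <- 1
e2 <- new.env()
e2$value <- 1

# Environments are compared by reference, not by content.
identical(e1, e2)                     # FALSE, although contents match
identical(as.list(e1), as.list(e2))   # TRUE once converted to lists

# A stored reference follows later changes, so it cannot reveal them...
before_ref  <- e1
before_list <- as.list(e1, all.names = TRUE)   # ...but a list snapshot can
e1$value <- 99

identical(before_ref, e1)                               # TRUE (same object)
identical(before_list, as.list(e1, all.names = TRUE))   # FALSE (change seen)
```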
Demonstration
Let's test it by filling globalenv() with some stuff to begin with:
R6_class = R6::R6Class("Testing", list(a = 1))
my_R6 = R6_class$new()
my_env = new.env()
my_env$sub_env = new.env()
my_env$sub_env$some_value = 2
my_regular = rnorm(5)
Snapshot time!
start_state = env_as_list(sys.frame())
Let the user play:
my_R6$a = 99 # Change R6
new_regular = 3 # new var
my_env$sub_env$some_value = 99 # Change sub-environment
Snapshot again!
end_state = env_as_list(sys.frame())
end_state$start_state = NULL # don't include this
Did nothing change?
> identical(start_state, end_state)
# FALSE
Which variables changed?
> is_same = lapply(names(end_state), function(x) identical(start_state[[x]], end_state[[x]]))
> names(end_state)[is_same == FALSE]
# "my_env" "new_regular" "my_R6"
Bonus
You can also use this to compute the size of an environment, including all R6 and sub-environments. Simply:
object.size(env_as_list(globalenv()))

Recursive manipulation of list elements in R

I have a nested list in the global environment of a R script.
anno <- list()
anno[['100']] <- list(
name = "PLACE",
color = "#a6cee3",
isDocumentAnnotation = T,
sublist = list()
)
person_sublist <- list()
person_sublist[['200']] <- list(
name = "ACTOR",
color = "#7fc97f",
isDocumentAnnotation = T,
sublist = list()
)
person_sublist[['300']] <- list(
name = "DIRECTOR",
color = "#beaed4",
isDocumentAnnotation = T,
sublist = list()
)
anno[['400']] <- list(
name = "PERSON",
color = "#1f78b4",
isDocumentAnnotation = T,
sublist = person_sublist
)
While running my process I interactively select elements via the id (100, 200, ...). In return I want to add, delete or move elements in the list.
For this reason I thought of using a recursive function to navigate through the list:
searchListId <- function(parent_id = NULL, annotation_system = NULL)
{
for(id in names(annotation_system))
{
cat(paste(id,"\n"))
if(id == parent_id)
{
return(annotation_system[[id]]$sublist)
}
else
{
if(length(annotation_system[[id]]$sublist) > 0)
{
el <- searchListId(parent_id, annotation_system[[id]]$sublist)
if(!is.null(el))
return(el)
}
}
}
return(NULL)
}
searchListId('100', anno)
This function returns the list() found in the sublist element of the matching element in the 'anno' list. My problem is the global environment of R. If I manipulate something (delete, add, or move something within the returned sublist), I need to reset the global variable with <<-. But in the case of a recursive function I only hold the current sublist in the context where the parent_id matches. How can one reference a global nested list in R while navigating through it via a recursive function? Is that even possible in R?
The calls I want to carry out in order to delete, add, or move elements in the list 'anno' are:
deleteListId('100', anno) #Should return the list without the element 100
addListId('400', anno) #Should return the list with a new element nested in '400'
switchListId('400','200', anno) #Should return a list where the elements with the according keys are switched.
The tricky part though is that I don't know how deep the recursive structure is. Normally I would use element references to manipulate them directly, but what would a solution for manipulating nested lists in R look like if I want to use recursion?
If possible, have the recursive function take a list, alter that, and return the new version. The reason I suggest this is because it's idiomatic R. R leans toward being a functional language, and part of that means state-based actions are discouraged. In general, functions should only modify state if that's all they do. For example, scale(x) doesn't affect the value stored in the x variable. But x <- scale(x) does, because the <- function (yes, it's a function) is meant to modify state.
Also, don't worry about memory unless you know it will be a problem based on past experience. Behind the scenes, R is pretty good at preventing needless copying, so trust it to do the right thing. This lets you work with simpler mental models.
A skeleton of how to recursively modify a list, without affecting the original:
anno <- list()
anno[['A1']] <- list(
sublist = list(
A3 = list(sublist = NULL),
A4 = list(sublist = list(A6 = list(sublist = NULL))),
A5 = list(sublist = NULL)
)
)
change_list <- function(x) {
for (i in seq_along(x)) {
value <- x[[i]]
if (is.list(value)) {
x[[i]] <- change_list(value)
} else {
if (is.null(value)) {
x[[i]] <- "something new"
}
}
}
x
}
change_list(anno)
# $A1
# $A1$sublist
# $A1$sublist$A3
# $A1$sublist$A3$sublist
# [1] "something new"
#
#
# $A1$sublist$A4
# $A1$sublist$A4$sublist
# $A1$sublist$A4$sublist$A6
# $A1$sublist$A4$sublist$A6$sublist
# [1] "something new"
#
#
#
#
# $A1$sublist$A5
# $A1$sublist$A5$sublist
# [1] "something new"
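The same return-a-new-list pattern could be applied to the deletion case from the question. This is a sketch only: delete_list_id() is a hypothetical implementation, not the asker's function, and the anno list is a cut-down version of the one in the question.

```r
# Hypothetical helper: return a copy of `annotation_system` without the
# element whose name is `target_id`, searching nested `sublist`s too.
delete_list_id <- function(target_id, annotation_system) {
  # Drop a matching element at this level (no-op if absent).
  annotation_system[[target_id]] <- NULL
  # Recurse into the sublist of each remaining element.
  for (id in names(annotation_system)) {
    sub <- annotation_system[[id]]$sublist
    if (length(sub) > 0) {
      annotation_system[[id]]$sublist <- delete_list_id(target_id, sub)
    }
  }
  annotation_system
}

anno <- list(
  "100" = list(name = "PLACE", sublist = list()),
  "400" = list(name = "PERSON", sublist = list(
    "200" = list(name = "ACTOR", sublist = list()),
    "300" = list(name = "DIRECTOR", sublist = list())
  ))
)

without_200 <- delete_list_id("200", anno)
names(without_200)                    # "100" "400"
names(without_200[["400"]]$sublist)   # "300"
names(anno[["400"]]$sublist)          # "200" "300" (original untouched)
```

To update the global list, reassign the result: anno <- delete_list_id("200", anno). The depth of the structure does not matter, because each level only handles its own names and delegates the rest.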
If you absolutely need to modify an item in the global namespace, use environments instead of lists.
anno_env <- new.env()
anno_env[["A1"]] <- new.env()
anno_env[["A1"]][["sublist"]] <- new.env()
anno_env[["A1"]][["sublist"]][["A3"]] <- NULL
anno_env[["A1"]][["sublist"]][["A4"]] <- NULL
change_environment <- function(environ) {
for (varname in ls(envir = environ)) {
value <- environ[[varname]]
if (is.environment(value)) {
change_environment(value)
} else {
environ[[varname]] <- "something new"
}
}
}
change_environment(anno_env)
anno_env[["A1"]][["sublist"]][["A3"]]
# [1] "something new"

How to use apply function with list of functions with multiple argument in r?

I repeat more than 10 functions, three or more times each, in R! It is very confusing and wastes my time. I understand the basic idea of the apply functions, but I need help with this issue.
I have these functions (part of my whole functions):
sel_1 <- lower.tri(fam1) # selector for lower triangular matrix
if (check.pars & (any(fam1 != 0) | any(!is.na(par11)))) {
BiCopCheck(fam1[sel_1], par11[sel_1], par21[sel_1], call = match.call())
}
sel_2 <- lower.tri(fam2)
if (check.pars & (any(fam2 != 0) | any(!is.na(par11)))) {
BiCopCheck(fam2[sel_2], par12[sel_2], par22[sel_2], call = match.call())
}
sel_3 <- lower.tri(fam3)
if (check.pars & (any(fam3 != 0) | any(!is.na(par13)))) {
BiCopCheck(fam3[sel_3], par13[sel_3], par23[sel_3], call = match.call())
}
MixRVM1 <- list(Matrix = Matrix,
fam1 = fam1,
par11 = par11,
par21 = par21,
names = names,
MaxMat = MaxMat,
CondDistr = CondDistr)
MixRVM12 <- list(Matrix = Matrix,
fam2 = fam2,
par12 = par12,
par22 = par22,
names = names,
MaxMat = MaxMat,
CondDistr = CondDistr)
Is there an easy way to repeat these functions?
It's hard without the data, but by following these principles you should be able to improve your code:
If you don't already have your fam and par variables in a neat format (which you should if you have control over it):
fam_variables <- grep("fam[0-9]",ls(),value=TRUE)
# order the names by their numeric suffix (fam1, fam2, ...)
fam_variables <- fam_variables[order(sapply(fam_variables,function(x){as.numeric(substr(x,4,nchar(x)))}))]
fam <- lapply(fam_variables,get) # assuming there's no missing fam variable from 1 to n!
par_list <- list(list(par11,par12,par13),list(par21,par22,par23))
Then you can use apply functions over these lists:
sel <- lapply(fam,lower.tri)
sapply(1:3,function(i){BiCopCheck(fam[[i]][sel[[i]]], par_list[[1]][[i]][sel[[i]]], par_list[[2]][[i]][sel[[i]]], call = match.call())})
MixRVM <- list() # we create a list, and we'll keep the same structure for every item (so the name will be the same among elements)
for (i in 1:2){
MixRVM[[i]] <- list(Matrix = Matrix,
fam = fam[[i]],
par1i = par_list[[1]][[i]],
par2i = par_list[[2]][[i]],
names = names,
MaxMat = MaxMat,
CondDistr = CondDistr)
}
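Another option for walking several lists in parallel is Map(). The sketch below uses made-up matrices and a hypothetical check_lower() in place of BiCopCheck(), since the real data and package are not available here; it only shows the iteration pattern.

```r
# Made-up stand-ins for fam1..fam3, par11..par13 and par21..par23.
fam  <- list(matrix(1:9, 3), matrix(2, 3, 3), matrix(0, 3, 3))
par1 <- list(matrix(0.5, 3, 3), matrix(0.6, 3, 3), matrix(0.7, 3, 3))
par2 <- list(matrix(0, 3, 3), matrix(0, 3, 3), matrix(0, 3, 3))

# Hypothetical checker used in place of BiCopCheck(): it only collects
# the lower-triangular entries that the real check would receive.
check_lower <- function(f, p1, p2) {
  sel <- lower.tri(f)
  list(family = f[sel], par1 = p1[sel], par2 = p2[sel])
}

# Map() walks the three lists in parallel, one call per model.
checked <- Map(check_lower, fam, par1, par2)

length(checked)         # 3
checked[[1]]$family     # 2 3 6
checked[[2]]$par1       # 0.6 0.6 0.6
```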

R - Use names in a list to feed named objects to a loop?

I have a data frame of some 90 financial symbols (will use 3 for simplicity)
> View(syM)
symbol
1 APPL
2 YAHOO
3 IBM
I created a function that gets JSON data for these symbols and produce an output. Basically:
nX <- function(x) {
#get data for "x", format it, and store it in "nX"
nX <- x
return(nX)
}
I used a loop to get the data and store the zoo series named after each symbol accordingly.
for (i in 1:nrow(syM)) {
assign(x = paste0(syM[i,]),
value = nX(x = syM[i,]))
Sys.sleep(time = 1)
}
Which results in:
[1] "APPL" "YAHOO" "IBM"
Each is a zoo series with 5 columns of data.
Further, I want to get some plotting done to each series and output the result, preferably using a for loop or something better.
yN <- function(y) {
#plot "y" series, columns 2 and 3, and store it in "yN"
yN <- y[,2:3]
return(yN)
}
Following a similar logic to my previous loop I tried:
for (i in 1:nrow(syM)) {
assign(x = paste0(pairS[i,],".plot"),
value = yN(y = paste0(syM[i,])))
}
But so far the data is not being sent to the function, only the name of the symbol, so I naturally get:
y[,2:3] : incorrect number of dimensions
I have also tried:
for (i in 1:nrow(syM)) {
assign(x = paste0(syM[i,],".plot"),
value = yN(y = ls(pattern = paste0(syM[i,]))))
}
With similar results. When I input the name of the series manually it does save the plot of the first symbol as "APPL.Plot".
assign(paste0(syM[1,], ".Plot"),
value = yN(y = APPL))
Consider lapply with setNames to create a named list of nX returned objects:
nX_list <- setNames(lapply(syM$symbol, nX), syM$symbol)
# OUTPUT ZOO OBJECTS BY NAMED INDEX
nX_list$APPL
nX_list$YAHOO
nX_list$IBM
# CREATE SEPARATE OBJECTS FROM LIST
# BUT NO NEED TO FLOOD GLOBAL ENVIR W/ 90 OBJECTS, JUST USE 1 LIST
list2env(nX_list, envir=.GlobalEnv)
For the plot function, first retrieve the object by its string name inside the function (by list lookup, or with get() if you created separate objects), then similarly run lapply with setNames:
yN <- function(y) {
#plot "y" series, columns 2 and 3, and store it in "yN"
yobj <- nX_list[[y]] # IF USING ABOVE LIST
# yobj <- get(y) # USE THIS LINE INSTEAD IF USING SEPARATE OBJECTS
yN <- yobj[,2:3]
return(yN)
}
plot_list <- setNames(lapply(syM$symbol, yN), paste0(syM$symbol, ".plot"))
# OUTPUT PLOTS BY NAMED INDEX
plot_list$APPL.plot
plot_list$YAHOO.plot
plot_list$IBM.plot
# CREATE SEPARATE OBJECTS FROM LIST
# BUT NO NEED TO FLOOD GLOBAL ENVIR W/ 90 OBJECTS, JUST USE 1 LIST
list2env(plot_list, envir=.GlobalEnv)
As you note, you're calling yN with a character argument in:
for (i in 1:nrow(syM)) {
assign(x = paste0(pairS[i,],".plot"),
value = yN(y = paste0(syM[i,])))
}
paste0(syM[i,]) is going to resolve to a character and not the zoo object it appears you're trying to reference. Instead, use something like get():
for (i in 1:nrow(syM)) {
assign(x = paste0(pairS[i,],".plot"),
value = yN(y = get(paste0(syM[i,]))))
}
Or perhaps just store your zoo objects in a list in the first place and then operate on all elements of the list with something like lapply()...
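A rough sketch of that list-based approach, with plain matrices standing in for the zoo series: fetch_series() and select_columns() are hypothetical replacements for nX() and yN(), and the data are made up.

```r
symbols <- c("APPL", "YAHOO", "IBM")

# Hypothetical stand-in for nX(): returns a 5-column matrix instead of
# downloading data and building a zoo series.
fetch_series <- function(symbol) {
  matrix(seq_len(20), ncol = 5,
         dimnames = list(NULL, paste0("col", 1:5)))
}

# Same role as yN(): keep columns 2 and 3.
select_columns <- function(series) series[, 2:3]

series_list <- setNames(lapply(symbols, fetch_series), symbols)
plot_list   <- lapply(series_list, select_columns)

names(plot_list)             # "APPL" "YAHOO" "IBM"
dim(plot_list$IBM)           # 4 2
colnames(plot_list$APPL)     # "col2" "col3"
```

Because the objects themselves travel through the list, no get() or assign() is needed and nothing has to be looked up by name.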
