I have the following function:
example_Foo <- function( ...,FigureFolder){
# check what variables are passed through the function
v_names <- as.list(match.call())
variable_list <- v_names[2:(length(v_names)-2)]
# create file to store figures
subDir <- c(paste(FigureFolder,"SavedData",sep = "\\"))
}
Obviously this is just the start of the function, but I have already run into some problems. Here, I am trying to define the directory where I eventually want my results to be saved. An example of using the function is:
weight <- c(102,20,30,04,022,01,220,10)
height <- c(102,20,30,04,022,01,220,10)
catg <- c(102,20,30,04,022,01,220,10)
catg <- matrix(height,nrow = 2)
FigureFolder <- "C:\\exampleDat"
# this is the function
example_Foo(catg,FigureFolder)
This generates the following error:
Error in paste(FigureFolder, "SavedData", sep = "\\") :
argument "FigureFolder" is missing, with no default
which i'm guessing is due to the function not knowing what 'FigureFolder' is, my question is how do I pass this string through the function?
Because you do not use a named argument, the FigureFolder argument is put into .... Just use:
example_Foo(catg, FigureFolder = FigureFolder)
In addition:
example_Foo <- function( ...,FigureFolder){
# check what variables are passed through the function
v_names <- as.list(match.call())
variable_list <- v_names[2:(length(v_names)-2)]
# create file to store figures
subDir <- c(paste(FigureFolder,"SavedData",sep = "\\"))
}
could also be replaced by:
example_Foo <- function( ...,FigureFolder){
# check what variables are passed through the function
variable_list = list(...)
# create file to store figures
subDir <- c(paste(FigureFolder,"SavedData",sep = "\\"))
}
Or even simpler:
example_Foo <- function(variable_list, FigureFolder){
# create file to store figures
subDir <- c(paste(FigureFolder,"SavedData",sep = "\\"))
}
Keeping your code simple makes it easier to read (also for yourself) and easier to use and maintain.
You need to providea value for FigureFolder, eg
example_Foo(catg,FigureFolder="FigureFolder")
Related
I have the following situation: I have different dataframes, I would like to be able, for each dataframe, to create 2 dataframes according to the value of one of the columns (log2FoldChange>1 and logFoldChange<-1).
For this I use the following code:
DJ29_T0_Overexpr = DJ29_T0[which(DJ29_T0$log2FoldChange > 1),]
DJ29_T0_Underexpr = DJ29_T0[which(DJ21_T0$log2FoldChange < -1),]
DJ229_T0 being one of my dataframe.
First problem: the sign for the dataframe where log2FoldChange < -1 is not taken into account.
But the main problem is at the time of making the function, I wrote the following:
spliteOverUnder <- function(res){
nm <-deparse(substitute(res))
assign(paste(nm,"_Overexpr", sep=""), res[which(as.numeric(as.character(res$log2FoldChange)) > 1),])
assign(paste(nm,"_Underexpr", sep=""), res[which(as.numeric(as.character(res$log2FoldChange)) < -1),])
}
Which I then ran with :
spliteOverUnder(DJ29_T0)
No error message, but my objects are not exported in my global environment. I tried with return(paste(nm,"_Overexpr", sep="") but it only returns the object name but not the associated dataframe.
Using paste() forces the use of assign(), so I can't do :
spliteOverUnder <- function(res){
nm <-deparse(substitute(res))
paste(nm,"_Overexpr", sep="") <<- res[which(as.numeric(as.character(res$log2FoldChange)) > 1),]
paste(nm,"_Underexpr", sep="") <<- res[which(as.numeric(as.character(res$log2FoldChange)) < -1),]
}
spliteOverUnder(DJ24_T0)
I encounter the following error:
Error in paste(nm, "_Overexpr", sep = "") <<- res[which(as.numeric(as.character(res$log2FoldChange)) > :
could not find function "paste<-"
If you've encountered this difficulty before, I'd appreciate a little help.
And if you knew, once the function works, how to use a For loop going through a list containing all my dataframes to apply this function to each of them, I'm also a taker.
Thanks
When assigning, use the pos argument to hoist the new objects out of the function.
function(){
assign(x = ..., value = ...,
pos = 1 ## see below
)
}
... where 0 = the function's local environment, 1 = the environment next up (in which the function is defined) etc.
edit
A general function to create the split dataframes in your global environment follows. However, you might rather want to save the new dataframes (from within the function) or just forward them to downstream functions than cram your workspace with intermediary objects.
splitOverUnder <- function(the_name_of_the_frame){
df <- get(the_name_of_the_frame)
df$cat <- cut(df$log2FoldChange,
breaks = c(-Inf, -1, 1, Inf),
labels = c('underexpr', 'normal', 'overexpr')
)
split_data <- split(df, df$cat)
sapply(c('underexpr', 'overexpr'),
function(n){
new_df_name <- paste(the_name_of_the_frame, n, sep = '_')
assign(x = new_df_name,
value = split_data$n,
envir = .GlobalEnv
)
}
)
}
## say, df1 and df2 are your initial dataframes to split:
sapply(c('df1', 'df2'), function(n) splitOverUnder(n))
I am using seurat to analyze some scRNAseq data, I have managed to put all the SCT integration one line codes from satijalab into a function with basically
SCT_normalization <- function (f1, f2) {
f_merge <- merge (f1, y=f2)
f.list <- SplitObject(f_merge, split.by = "stim")
f.list <- lapply(X = f.list, FUN = SCTransform)
features <- SelectIntegrationFeatures(object.list = f.list, nfeatures = 3000)
f.list <<- PrepSCTIntegration(object.list = f.list, anchor.features = features)
return (f.list)
}
so that I will have f.list in the global environment for downstream analysis and making plots. The problem I am running into is that, every time I run the function, the output would be f.list, I want it to be specific to the input value name (i.e., f1 and/or f2). Basically something that I can set so that I would know which input value was used to generate the final output. I saw something using the assign function but someone wrote a warning about "the evil and wrong..." so I am not sure as to how to approach this.
From what it sounds like you don't need to use the super assign function <<-. In my opinion, I don't think <<- should be used as it can cause unexpected changes in objects. This is what I assume the other person was saying. For example, if you have the following function:
AverageVector <- function(v) x <<- mean(v, rm.na = TRUE)
Now you're trying to find the average of a vector you have, along with more analysis
library(tidyverse)
x <- unique(iris$Species)
avg_sl <- AverageVector(iris$Sepal.Length)
Now where x used to be a character vector, it's not a numeric vector with a length of 1.
So I would remove the <<- and call your function like this
object_list_1_2 <- SCT_normalize(object1, object2)
If you wanted a slightly more programatic way you could do something like this to keep track of objects you could do something like this:
SCT_normalization <- function(f1, f2) {
f_merge <- merge (f1, y = f2)
f.list <- SplitObject(f_merge, split.by = "stim")
f.list <- lapply(X = f.list, FUN = SCTransform)
features <- SelectIntegrationFeatures(object.list = f.list, nfeatures = 3000)
f.list <- PrepSCTIntegration(object.list = f.list, anchor.features = features)
to_return <- list(inputs = list(f1, f2), normalized = f.list)
return(to_return)
}
I've just read about delayedAssign(), but the way you have to do it is by passing the name of the delayed variable as the first parameter. Is there a way to do it via direct assignment?
e.g.:
x <- delayed_variable("Hello World")
rather than
delayedAssign("x","Hello World")
I want to create a variable that will throw an error if accessed (use-case is obviously more complex), so for example:
f <- function(x){
y <- delayed_variable(stop("don't use y"))
x
}
f(10)
> 10
f <- function(x){
y <- delayed_variable(stop("don't use y"))
y
}
f(10)
> Error in f(10) : don't use y
No, you can't do it that way. Your example would be fine with the current setup, though:
f <- function(x){
delayedAssign("y", stop("don't use y"))
y
}
f(10)
which gives exactly the error you want. The reason for this limitation is that delayed_variable(stop("don't use y")) would create a value which would trigger the error when evaluated, and assigning it to y would evaluate it.
Another version of the same thing would be
f <- function(x, y = stop("don't use y")) {
...
}
Internally it's very similar to the delayedAssign version.
I reached a solution using makeActiveBinding() which works provided it is being called from within a function (so it doesn't work if called directly and will throw an error if it is). The main purpose of my use-case is a smaller part of this, but I generalised the code a bit for others to use.
Importantly for my use-case, this function can allow other functions to use delayed assignment within functions and can also pass R CMD Check with no Notes.
Here is the function and it gives the desired outputs from my question.
delayed_variable <- function(call){
#Get the current call
prev.call <- sys.call()
attribs <- attributes(prev.call)
# If srcref isn't there, then we're not coming from a function
if(is.null(attribs) || !"srcref" %in% names(attribs)){
stop("delayed_variable() can only be used as an assignment within a function.")
}
# Extract the call including the assignment operator
this_call <- parse(text=as.character(attribs$srcref))[[1]]
# Check if this is an assignment `<-` or `=`
if(!(identical(this_call[[1]],quote(`<-`)) ||
identical(this_call[[1]],quote(`=`)))){
stop("delayed_variable() can only be used as an assignment within a function.")
}
# Get the variable being assigned to as a symbol and a string
var_sym <- this_call[[2]]
var_str <- deparse(var_sym)
#Get the parent frame that we will be assigining into
p_frame <- parent.frame()
var_env <- new.env(parent = p_frame)
#Create a random string to be an identifier
var_rand <- paste0(sample(c(letters,LETTERS),50,replace=TRUE),collapse="")
#Put the variables into the environment
var_env[["p_frame"]] <- p_frame
var_env[["var_str"]] <- var_str
var_env[["var_rand"]] <- var_rand
# Create the function that will be bound to the variable.
# Since this is an Active Binding (AB), we have three situations
# i) It is run without input, and thus the AB is
# being called on it's own (missing(input)),
# and thus it should evaluate and return the output of `call`
# ii) It is being run as the lhs of an assignment
# as part of the initial assignment phase, in which case
# we do nothing (i.e. input is the output of this function)
# iii) It is being run as the lhs of a regular assignment,
# in which case, we want to overwrite the AB
fun <- function(input){
if(missing(input)){
# No assignment: variable is being called on its own
# So, we activate the delayed assignment call:
res <- eval(call,p_frame)
rm(list=var_str,envir=p_frame)
assign(var_str,res,p_frame)
res
} else if(!inherits(input,"assign_delay") &&
input != var_rand){
# Attempting to assign to the variable
# and it is not the initial definition
# So we overwrite the active binding
res <- eval(substitute(input),p_frame)
rm(list=var_str,envir=p_frame)
assign(var_str,res,p_frame)
invisible(res)
}
# Else: We are assigning and the assignee is the output
# of this function, in which case, we do nothing!
}
#Fix the call in the above eval to be the exact call
# rather than a variable (useful for debugging)
# This is in the line res <- eval(call,p_frame)
body(fun)[[c(2,3,2,3,2)]] <- substitute(call)
#Put the function inside the environment with all
# all of the variables above
environment(fun) <- var_env
# Check if the variable already exists in the calling
# environment and if so, remove it
if(exists(var_str,envir=p_frame)){
rm(list=var_str,envir=p_frame)
}
# Create the AB
makeActiveBinding(var_sym,fun,p_frame)
# Return a specific object to check for
structure(var_rand,call="assign_delay")
}
How do I pass a pre-existing object, such as a DF, to my custom functions, given the setup I have (see below)?
Alternatively, do I need to set up my custom functions differently?
My functions reside in a series of *.R scripts.
I source the functions in my .Rprofile:
.env$fxShortName <- function(){
source("C:\\path\\to\\scriptFile.R")
}
Not-Quite Solutions:
1) Defining the Function Manually || It works w/ the obvious drawback that I need to manually load my functions each time.
2) Rscript + commandArgs || This works if I define the DF within the function, like this:
#foo.R
a <- data.frame(a = c(1))
b <- data.frame(b = c(1))
args <- commandArgs(trailingOnly = TRUE)
print(args)
data.frame.name <- args[1]
print(colnames(get(data.frame.name)))
Rscript creates a new R instance, though, so it doesn't see my pre-existing DF. At least, it doesn't find it out-of-the-box.
3) Function w/ substitute, match.call, etc. || I've adopted %>>% to set up auto-updated views of certain DFs, so I tried modifying the setup that works in that case. For %>>% I have this code in my .Rprofile:
.env$`%>>%` <- function(expr, x) {
x <- substitute(x)
call <- match.call()[-1]
fun <- function() {NULL}
body(fun) <- call$expr
makeActiveBinding(sym = deparse(x), fun = fun, env = parent.frame())
invisible(NULL)
}
This type of setup works with DFs from my current session. However, I prefer the structure offered by keeping my custom scripts separate from my .Rprofile.
4) get() & mget() || This seemed promising, but I don't understand it enough to definitively say whether or not it will help. And, yes, I did RTFM.
Reproducible Example:
myfx(head, preExistingDF)
Sample Code:
myfx <- function(expr, x) {
x <- substitute(x)
call <- match.call()[-1]
fun <- function() {NULL}
body(fun) <- call$expr
print(body(fun))}
Put the sample code in a script. Add the following code to your .Rprofile:
.env$mySamplefx <- function(){
source("C:\\path\\to\\myfx.R")
}
Then try it after adding the code directly to your .Rprofile.
I have loaded in a R console different type of objects.
I can remove them all using
rm(list=ls())
or remove only the functions (but not the variables) using
rm(list=lsf.str())
My question is:
is there a way to remove all variables except the functions
Here's a one-liner that removes all objects except for functions:
rm(list = setdiff(ls(), lsf.str()))
It uses setdiff to find the subset of objects in the global environment (as returned by ls()) that don't have mode function (as returned by lsf.str())
The posted setdiff answer is nice. I just thought I'd post this related function I wrote a while back. Its usefulness is up to the reader :-).
lstype<-function(type='closure'){
inlist<-ls(.GlobalEnv)
if (type=='function') type <-'closure'
typelist<-sapply(sapply(inlist,get),typeof)
return(names(typelist[typelist==type]))
}
You can use the following command to clear out ALL variables. Be careful because it you cannot get your variables back.
rm(list=ls(all=TRUE))
Here's a pretty convenient function I picked up somewhere and adjusted a little. Might be nice to keep in the directory.
list.objects <- function(env = .GlobalEnv)
{
if(!is.environment(env)){
env <- deparse(substitute(env))
stop(sprintf('"%s" must be an environment', env))
}
obj.type <- function(x) class(get(x, envir = env))
foo <- sapply(ls(envir = env), obj.type)
object.name <- names(foo)
names(foo) <- seq(length(foo))
dd <- data.frame(CLASS = foo, OBJECT = object.name,
stringsAsFactors = FALSE)
dd[order(dd$CLASS),]
}
> x <- 1:5
> d <- data.frame(x)
> list.objects()
# CLASS OBJECT
# 1 data.frame d
# 2 function list.objects
# 3 integer x
> list.objects(env = x)
# Error in list.objects(env = x) : "x" must be an environment
I wrote this to remove all objects apart from functions from the current environment (Programming language used is R with IDE R-Studio):
remove_list=c() # create a vector
for(i in 1:NROW(ls())){ # repeat over all objects in environment
if(class(get(ls()[i]))!="function"){ # if object is *not* a function
remove_list=c(remove_list,ls()[i]) # ..add to vector remove_list
}
}
rm(list=remove_list) # remove all objects named in remove_list
Notes-
The argument "list" in rm(list=) must be a character vector.
The name of an object in position i of the current environment is returned from ls()[i] and the object itself from get(ls()[i]). Therefore the class of an object is returned from class(get(ls()[i]))