How to pass an unknown argument to a function in R

I'm assigning a data frame to a variable name taken from a string.
So when I run the code I don't know what the variable name will be.
I want to pass that data frame to another function to plot it. How can I pass it to the function without knowing its name?
file_name <- file.choose()
fname <- unlist (strsplit (file_name, "\\", fixed = TRUE))
fname <- fname[length(fname)]
waf_no <- unlist (strsplit (fname, "\\s"))
waf_no <- waf_no[grep(waf_no, pattern="WAF")]
data <- read_WAF_file (file_name)
assign(waf_no, flux_calc(data)) #flux calc() calculates and manipulates the data frame
plot_waf(?)
My plot_waf function is very simple:
plot_waf <- function (dataframe) {
library("ggplot2")
qplot(dist,n2o,data=dataframe,shape=treat)
}

The inverse of assign is get. As the documentation puts it:
Search by name for an object (get) or zero or more objects (mget).
Therefore, you'll need to run your plot function like this:
plot_waf(get(waf_no))
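To make the assign/get round trip concrete, here is a minimal sketch. The name "WAF12" and the small data frame are made-up stand-ins for the value parsed from the file name and for the result of flux_calc(data); they are assumptions, not part of the original code.

```r
# Hypothetical stand-ins: "WAF12" replaces the name parsed from the file name,
# and a small data frame replaces the result of flux_calc(data).
waf_no <- "WAF12"
assign(waf_no, data.frame(dist = c(1, 2, 3), n2o = c(0.4, 0.7, 0.9)))

# exists() is a cheap guard before calling get() on a computed name
stopifnot(exists(waf_no))

# get() looks the object up by the name stored in the string
df <- get(waf_no)
str(df)
```

If the name is only needed for labelling, it may be simpler to keep the data frame in an ordinary variable and carry the name alongside it as a string.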

Related

Use vector of strings representing variables generated by a for loop as variable in function call

I would like to call a function() in R using variable names out of a string vector. In my specific case I used a for loop to generate numbered variable names. For example:
for (i in 1:3) {
assign(paste0("x",i),"value")
}
v <- paste0("x",c(1,2,3))
function(v) #Here I would like to call function with variables defined in for loop.
Background:
I have a data frame, which I filter by different criteria (controls, cases). The resulting vectors are turned into GRanges objects and saved as variables, or saved as a string vector of file paths. I would then like to call IDRfilterN3() with the same number of GRanges objects and paths. This function expects 3 GRanges objects and 3 paths.
Here the specific example:
for (i in unique(samples$Treatment)){
ind <- samples$Treatment == i
for (j in samples$Replicate[ind]){
assign(paste0("peaks_",i,"_",j),toGRanges(file.path(samples$Peaks[ind][j]),format = "MACS2"))
assign(paste0("bam_",i,"_",j),file.path(samples$bamReads[ind][j]))
}
if(length(samples$Replicate[ind])==3) assign(paste0("peaks_",i,".idr"),IDRfilterN3(paste0("peaks_",i,"_",c(1,2,3)),paste0("bam_",i,"_",c(1,2,3))))
else print("Sample size not applicable. IDR failed!")
}
Here an example of a sample sheet and the steps I need to perform:
ids <- paste0("ID",c(1,2,3,4))
condition <- c(rep("A",2),rep("B",2))
replicate <- c(1,2,1,2)
pathA <- paste0("/dir/sample_",c(1,2,3,4))
pathB <- paste0("/dir/sample_",c(1,2,3,4),".bam")
d <- data.frame(ids,condition,replicate,pathA,pathB)
#Generate variable storing Grange object out of pathA path for each condition and replicate:
peakA1 <- toGRanges("/dir/sample_1")
peakA2 <- toGRanges("/dir/sample_2")
peakB1 <- toGRanges("/dir/sample_3")
peakB2 <- toGRanges("/dir/sample_4")
#Call IDR function for each condition (A, B) using the replicate variables and the path from column pathB
IDRfilterN3(peakA1, peakA2, "/dir/sample_1.bam", "/dir/sample_2.bam")
IDRfilterN3(peakB1, peakB2, "/dir/sample_3.bam", "/dir/sample_4.bam")
Thanks for your input.
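One possible approach, sketched under assumptions: mget() fetches several objects by name into a list, and do.call() passes that list to a function as separate arguments. The function add_three below is a hypothetical stand-in for a function that expects three arguments; it is not IDRfilterN3, whose exact signature is not shown here.

```r
# Toy setup mirroring the loop in the question
for (i in 1:3) {
  assign(paste0("x", i), i * 10)
}
v <- paste0("x", 1:3)

# Hypothetical stand-in for a function that expects three separate arguments
add_three <- function(a, b, c) a + b + c

# mget() returns a named list of the objects; unname() lets do.call()
# match them to the function's arguments by position rather than by name
args <- unname(mget(v))
result <- do.call(add_three, args)
result
```

Storing the objects in a list from the start, instead of creating numbered variables with assign, would avoid the lookup step altogether.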

Assign dataframe name to a variable in a function

I have created a function that takes a data frame as a parameter. Now I would like to get the name of that data frame as a string and store it in a string variable.
Code used:
RFun <- function(a){
args=(commandArgs(TRUE))
l<<-80
h<<-85
fname<<-paste(a,"_Temp.csv")
a_R<-filter(a_RW,cs==2|cs==3)
a_R<-a_Rinse[-c(2,3)]
write.csv(a_R,file=fname,row.names=FALSE)
a_Rinse_Temperature_Deviations <- read.csv(paste("~/",fname))
}
RFun(df)
When I run the function above, it creates the numeric variables l and h with the values I specified, but fname ends up holding the contents of the complete data frame (rows and columns) instead of the file name I need.
It also takes a long time to execute.
The expected value of fname is df_Temp.csv, where df is the name of the data frame.
Looks like assign(String varName , obj Value) might get you where you need to be.
RFun<-function(a){
args=(commandArgs(TRUE))
l<<-80
h<<-85
fname <<- "File_Name_Text"
assign (fname,paste(a,"_Temp.csv"))
a_R<-filter(a_RW,cs==2|cs==3)
a_R<-a_Rinse[-c(2,3)]
write.csv(a_R,file=fname,row.names=FALSE)
a_Rinse_Temperature_Deviations <- read.csv(paste("~/",fname))
}
It's hard to follow without a working example. But try to assign only the "name" of your df instead of the complete df. Try this:
fname <<- paste(deparse(substitute(a)),"_Temp.csv",sep="")
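A small sketch of why this helps: paste(a, "_Temp.csv") pastes the contents of the data frame, one string per column, whereas deparse(substitute(a)) yields the name that was used in the call. The helper temp_file_name is a hypothetical name used only for illustration.

```r
# Hypothetical helper: builds the output file name from the name of the
# object passed in, without touching the data itself
temp_file_name <- function(a) {
  paste0(deparse(substitute(a)), "_Temp.csv")
}

df <- data.frame(x = 1:3)
fname <- temp_file_name(df)
fname
```

Returning the file name from the function, as above, also avoids the global assignment with <<-.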

Write an R Function that repeats data manipulating routines

Here is what I would expect the function to do:
dlist <- c("var1","var2",...)
my.function <- function(dlist){
n <- length(dlist)
varnames <- paste("data", dlist, sep = ".")
for (...) { # for each var in 'varnames'
... # grab each variable from some specific online dataset;
... # do some basic data manipulation for each variable
}
... # return all the results
}
The main difficulty for me is:
(1) how to write the loop so that the grabbed data can be properly stored temporarily, and
(2) how the multiple variables could be returned, after finishing the loop;
EDIT:
The loop can create the variables I want, say VAR1 and VAR2, whose names are stored in the 'dlist' argument, but I cannot manipulate VAR1 or VAR2 inside the function: dlist[1] or dlist[2] only gives me a string, not the variable itself.
Thanks in advance.
I think I have solved the problem and made the function work as I expected.
As I described in the question, the main problem is how to manipulate the variables VAR1 and VAR2 when only their names are available as strings in the function.
eval combined with as.name should work:
eval(as.name(dlist[i]))
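For what it's worth, get(dlist[i]) performs the same lookup as eval(as.name(dlist[i])). The sketch below also shows one way to return several results at once, by collecting them in a named list. VAR1, VAR2 and the doubling step are made-up placeholders for the real data and manipulation.

```r
# Made-up stand-ins for the variables named in 'dlist'
VAR1 <- c(1, 2, 3)
VAR2 <- c(10, 20, 30)

my.function <- function(dlist) {
  results <- list()
  for (name in dlist) {
    # get() searches the function's own environment first, then the enclosing ones
    value <- get(name)
    # placeholder manipulation; replace with the real processing
    results[[name]] <- value * 2
  }
  results  # a named list holds all the results
}

out <- my.function(c("VAR1", "VAR2"))
out$VAR2
```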

Is there a more efficient/clean approach to an eval(parse(paste0( set up?

Sometimes I have code which references a specific dataset based on some variable ID. I have then been creating lines of code using paste0, and then eval(parse(...)) that line to execute the code. This seems to be getting sloppy as the length of the code increases. Are there any cleaner ways to have dynamic data reference?
Example:
dataset <- "dataRef"
execute <- paste0("data.frame(", dataset, "$column1, ", dataset, "$column2)")
eval(parse(text = execute))
But now imagine a scenario where dataRef would be called for 1000 lines of code, and sometimes needs to be changed to dataRef2 or dataRefX.
Combining the comments of Jack Maney and G.Grothendieck:
It is better to store your data frames that you want to access by a variable in a list. The list can be created from a vector of names using get:
mynames <- c('dataRef','dataRef2','dataRefX')
# or mynames <- paste0( 'dataRef', 1:10 )
mydfs <- setNames( lapply( mynames, get ), mynames )  # name the elements so they can be looked up by name
Then your example becomes:
dataset <- 'dataRef'
mydfs[[dataset]][,c('column1','column2')]
Or you can process them all at once using lapply, sapply, or a loop:
mydfs2 <- lapply( mydfs, function(x) x[,c('column1','column2')] )
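As a possible shortcut, mget() returns the objects as a list that is already named after the strings, so the lookup by name works directly. The data frames below are made up for the sake of a runnable sketch.

```r
# Made-up data frames standing in for dataRef, dataRef2, ...
dataRef  <- data.frame(column1 = 1:2, column2 = 3:4, column3 = 5:6)
dataRef2 <- data.frame(column1 = 7:8, column2 = 9:10, column3 = 11:12)

mynames <- c("dataRef", "dataRef2")
mydfs <- mget(mynames)   # named list: names(mydfs) equals mynames

# Switching datasets now only means changing this one string
dataset <- "dataRef2"
subset_df <- mydfs[[dataset]][, c("column1", "column2")]
subset_df
```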
@G.Grothendieck has shown you how to use get and [ to turn a character value into the named object it refers to, and then reference named elements within that object. I don't know what your code was intended to accomplish, since executing that code would only print values to the console; they would not have been assigned to a name and would have been garbage collected. If you wanted to use three character values (objname, colname1 and colname2) and assign those columns to an object named after a fourth character value, you could write:
newname <- "newdf"
assign( newname, get(dataset)[ c(colname1, colname2) ] )
The lesson to learn is that assign and get can take character values and access or create named objects, which can be either data objects or functions. Carl_Witthoft mentions do.call, which can construct function calls from character values.
do.call("data.frame", setNames(list( dfrm$x, dfrm$y), c('x2','y2') ) )
do.call("mean", dfrm[1])
# second argument must be a list of arguments to `mean`
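Putting these pieces together in one runnable sketch, with a made-up data frame dfrm and made-up values for the character variables:

```r
# Made-up data and names for illustration
dfrm <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
dataset  <- "dfrm"
colname1 <- "x"
colname2 <- "y"
newname  <- "newdf"

# get() fetches the object by name; assign() stores the result under a new name
assign(newname, get(dataset)[c(colname1, colname2)])

# do.call() builds the call from a function name and a list of arguments
renamed <- do.call("data.frame", setNames(list(dfrm$x, dfrm$y), c("x2", "y2")))
avg <- do.call("mean", list(dfrm$x))
```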

Print dataframe name in function output

I have a function that looks like this:
removeRows <- function(dataframe, rows.remove){
dataframe <- dataframe[-rows.remove,]
print(paste("The", paste0(rows.remove, "th"), "row was removed from", "xxxxxxx"))
}
I can use the function like this to remove the 5th row from the dataframe:
removeRows(mtcars, 5)
The function output this message:
"The 5th row was removed from xxxxxxx"
How can I replace xxxxxxx with the name of the dataframe I have used, so in this case mtcars?
You need to access the variable name in an unevaluated context. We can use substitute for this:
removeRows <- function(dataframe, rows.remove) {
df.name <- deparse(substitute(dataframe))
dataframe <- dataframe[-rows.remove,]
print(paste("The", paste0(rows.remove, "th"), "row was removed from", df.name))
}
In fact, that is its main use; as per the documentation,
The typical use of substitute is to create informative labels for data sets and plots.
I would like to point out that df.name <- deparse(substitute(dataframe)) should be used at the top of your function, before any transformation is done. I used it right at the end of my function, just before calling ggsave, and there it does not return the name but the contents of the data frame, which is not what you want. This gave me a lot of headaches.
So something like this :
function(df){
df.name <- deparse(substitute(df))
ggplot()
ggsave()
}
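The effect of the position can be seen in a small sketch. The functions name_first and name_last and the data frame my_data are hypothetical names used only for this demonstration.

```r
my_data <- data.frame(a = 1:3, b = 4:6)

name_first <- function(df) {
  df.name <- deparse(substitute(df))  # captured before df is modified
  df <- df[1, , drop = FALSE]
  df.name
}

name_last <- function(df) {
  df <- df[1, , drop = FALSE]  # df is now an ordinary local variable
  deparse(substitute(df))      # deparses the data itself, not the name
}

first <- name_first(my_data)
last  <- name_last(my_data)
first
identical(last, "my_data")
```

Once the argument has been reassigned inside the function, substitute sees an ordinary variable and substitutes its value, which is why the name is lost.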

Resources