I would like to use a single object to pass multiple inputs to a function in R. Is this possible? MWE:
df <- data.frame(yes = c(10, 20), no = c(50, 60), maybe = c(100, 200))
fxn <- function(x, y, z){
  a = x + y
  b = x + z
  c = y + z
  return(list(a = a, b = b, c = c))
}
foo <- c("rincon","malibu","steamer")
bar <- c("no","maybe")
df[foo] <- fxn(df$yes,df[bar])
In the actual problem, my function has more inputs, which default to NULL. I am working in a dynamic Shiny context, so the value and length of bar change. Any help for this newbie would be greatly appreciated.
With base R you can build the call using do.call and create a list() of the parameters you want to pass to the function:
do.call("fxn", c(list(df$yes), unname(df[bar])))
This would be the same as
fxn(df$yes, df[bar][[1]], df[bar][[2]])
We need to use the unname() because otherwise your parameters would be named "no" and "maybe" while your function is expecting "y" and "z".
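Because the length of bar changes at run time, it may help to see the argument list built step by step. The sketch below reuses df, fxn, foo and bar from the question. fxn_opt and short_bar are hypothetical: they stand in for the asker's real function with NULL defaults, which the question does not show.

```r
df <- data.frame(yes = c(10, 20), no = c(50, 60), maybe = c(100, 200))
fxn <- function(x, y, z) {
  list(a = x + y, b = x + z, c = y + z)
}
foo <- c("rincon", "malibu", "steamer")
bar <- c("no", "maybe")

# First argument, then one unnamed entry per column listed in bar
args <- c(list(df$yes), unname(as.list(df[bar])))
res <- do.call(fxn, args)

# Store the three results under the new column names
df[foo] <- res

# Hypothetical variant with optional inputs (an assumption, not the asker's function):
# arguments that are not supplied keep their NULL default and are skipped
fxn_opt <- function(x, y = NULL, z = NULL) {
  extras <- Filter(Negate(is.null), list(y, z))
  Reduce(`+`, extras, x)
}
short_bar <- "no"
total <- do.call(fxn_opt, c(list(df$yes), unname(as.list(df[short_bar]))))
```

With this pattern the call itself never changes; only the list handed to do.call grows or shrinks with bar.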
With the rlang package, you could do
library(rlang)
eval_tidy(quo(fxn(df$yes, !!!unname(df[bar]))))
That uses the !!! splicing operator like some other languages have. Base R does not have such a syntax.
I'm sure the question is a bit dumb (sorry)... I'm trying to create a function using different variables I have stored in a data frame. The function looks like this:
mlr_turb <- function(Cond_in, Flow_in, pH_in, pH_out, Turb_in, nm250_i, nm400_i, nm250_o, nm400_o){
  Coag = (+0.032690 + 0.090289*Cond_in + 0.003229*Flow_in - 0.021980*pH_in - 0.037486*pH_out
          +0.016031*Turb_in -0.026006*nm250_i +0.093138*nm400_i - 0.397858*nm250_o - 0.109392*nm400_o)/0.167304
  return(Coag)
}
m4_turb <- mlr_turb(dataset)
The problem appears when I try to run my function on a data frame (whose columns have the same names as the arguments). It doesn't detect my variables and shows this message:
Error in mlr_turb(dataset) :
argument "Flow_in" is missing, with no default
But the column is actually there, along with all the other variables.
I think I have misplaced or left out something in the function that would let it take the variables from the dataset. I have searched a lot, but I have not found an answer...
No dumb questions!
I think you're looking for do.call. This function allows you to unpack values into a function as arguments. Here's a really simple example.
# a simple function that takes x, y and z as arguments
myFun <- function(x, y, z){
result <- (x + y)/z
return(result)
}
# a simple data frame with columns x, y and z
myData <- data.frame(x=1:5,
y=(1:5)*pi,
z=(11:15))
# unpack the values into the function using do.call
do.call('myFun', myData)
Output:
[1] 0.3765084 0.6902654 0.9557522 1.1833122 1.3805309
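One detail worth making explicit: do.call matches the data frame's columns to the function's arguments by name, so column order does not matter. A minimal sketch, where shuffled is an illustrative data frame that is not part of the original answer:

```r
myFun <- function(x, y, z) {
  (x + y) / z
}

# Same columns as myData, deliberately in a different order
shuffled <- data.frame(z = 11:15, x = 1:5, y = (1:5) * pi)

# Columns are passed as named arguments, so x, y and z still line up
res <- do.call(myFun, shuffled)
```

This is also why the approach should fit the mlr_turb case, as long as every argument has a column of the same name.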
You have run into a standard problem in R, related to standard evaluation (SE) versus non-standard evaluation (NSE). If you want more detail, you can have a look at this blog post I wrote.
I think the most convenient way to write function using variables is to use variable names as arguments of the function.
Let's take @Muon's example again.
# a simple function that takes x, y and z as arguments
myFun <- function(x, y, z){
result <- (x + y)/z
return(result)
}
The question is where R should find the values behind the names x, y and z. In a function, R first looks within the function's own environment (here x, y and z are defined as parameters), then in the environment where the function was defined (here the global environment), and then in the attached packages.
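The lookup order can be seen in a small sketch. The names offset, add_offset and add_local are made up for illustration:

```r
offset <- 100                  # defined outside any function

add_offset <- function(x) {
  x + offset                   # 'offset' is not local, so R finds it in the enclosing environment
}

add_local <- function(x, offset = 1) {
  x + offset                   # the parameter shadows the outer 'offset'
}

r1 <- add_offset(1)            # uses the outer value: 101
r2 <- add_local(1)             # uses the parameter default: 2
```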
In myFun, R expects vectors. If you give it a column name instead, you will get an error. What if you want to pass a column name? You must tell R that the name you gave should be looked up within the scope of a data frame. You can, for instance, do something like this:
myFun <- function(df, col1 = "x", col2 = "y", col3 = "z"){
  result <- (df[,col1] + df[,col2])/df[,col3]
  return(result)
}
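A usage sketch for this column-name version, assuming a data frame whose columns are not called x, y and z. The measurements data frame is invented for illustration, and this variant uses [[ ]] instead of [, ], which always returns a plain vector:

```r
myFun <- function(df, col1 = "x", col2 = "y", col3 = "z") {
  # [[ ]] extracts one column as a vector, whatever the column is called
  (df[[col1]] + df[[col2]]) / df[[col3]]
}

# Hypothetical data with differently named columns plus one unused column
measurements <- data.frame(a = 1:3, b = 4:6, c = c(5, 7, 9), unused = 7:9)

res <- myFun(measurements, col1 = "a", col2 = "b", col3 = "c")
```

Extra columns are simply ignored here, because the function only touches the columns it is told about.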
You can go much further in this direction with the data.table package. If you start writing functions that need to use variables from a data frame, I recommend having a look at this package.
I like Muon's answer, but I couldn't get it to work if there are columns in the data.frame not in the function. Using the with() function is a simple way to make this work as well...
#Code from Muon:
# a simple function that takes x, y and z as arguments
myFun <- function(x, y, z){
result <- (x + y)/z
return(result)
}
# a simple data frame with columns x, y and z
myData <- data.frame(x=1:5,
y=(1:5)*pi,
z=(11:15),
a=6:10) #adding a var not used in myFun
# unpack the values into the function using do.call
do.call('myFun', myData)
#generates an error for the unused "a" column
#using with() function:
with(myData, myFun(x, y, z))
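Another option, if you want to keep do.call, is to pass only the columns the function actually declares. This is a sketch using formals() to read the argument names; wanted is an illustrative variable name:

```r
myFun <- function(x, y, z) {
  (x + y) / z
}
myData <- data.frame(x = 1:5, y = (1:5) * pi, z = 11:15, a = 6:10)

# Names of the arguments myFun declares: "x", "y", "z"
wanted <- names(formals(myFun))

# Drop every column that is not an argument before unpacking
res <- do.call(myFun, myData[wanted])
```

This assumes every argument of the function has a matching column; a missing column would raise an "undefined columns selected" error at the subsetting step.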
In R, we can reference items created within that same list, i.e.:
list(a = a <- 1, b = a)
I am curious if there is a way to write a function which takes the place of a = a <- 1. That is, if something like
`%=%` <- function(x,y) {
envir <- environment()
char_x <- deparse(substitute(x))
assign(char_x, y, parent.env(envir))
unlist(lapply(setNames(seq_along(x),char_x), function(T) y))
}
# does not work
list(a%=%1, b=a)
is possible in R (i.e. returns the list given above)?
edit: I think this boils down to asking, 'can we call list with a language object that preserves all aspects of manually coding list?' (specifically, assigns the list's names attribute the left-hand side of the language element).
It seems to me that below shows that such a solution is hopeless.
my_call <- do.call(substitute, list(expr(expr = {x = y}), list(x=quote(a), y=1)))
equals <- languageEl(my_call, which = 1)
str(equals)
do.call(list, list(equals))
Welp, the clever folk behind tibble have figured this out in their lst() function (also in package dplyr)
library(dplyr)
lst(a=1, b=a, c=c(3,4), d=c)
What a useful feature!
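For readers who want to see the idea without a package, here is a rough base R sketch of sequential evaluation. seq_list is a hypothetical helper, not how tibble implements lst(), and it assumes every argument is named and non-NULL:

```r
seq_list <- function(...) {
  # Capture the arguments unevaluated, dropping the leading `list` symbol
  exprs <- as.list(substitute(list(...)))[-1]
  env <- new.env(parent = parent.frame())
  out <- list()
  for (nm in names(exprs)) {
    # Evaluate each argument where earlier results are already visible
    val <- eval(exprs[[nm]], env)
    assign(nm, val, envir = env)
    out[[nm]] <- val
  }
  out
}

res <- seq_list(a = 1, b = a + 1)
```

Each element is evaluated in an environment that already holds the earlier elements, which is what lets b refer to a.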
I currently have a basic script written in R, which has two functions embedded within another:
FunctionA <- Function() {
results_from_B <- FunctionB()
results_from_C <- FunctionC()
}
Function B generates some data which is then analysed in Function C.
If I stop the code within function A, I can see the structure of results_from_C - this appears under 'values' and I can refer to different elements using the syntax results_from_C$column_name_1.
I achieved this within Function C by specifying the returned values using:
return(list(column_name_1 = value1, column_name_2 = value2))
However, I cannot work out how I can return these same values (in the same structure) from Function A - everything I try returns a list which is formatted as 'Data' rather than 'Values' and cannot be indexed using the syntax results_from_A$column_name_1.
Can anyone help me to understand what I need to do in order to extract results from Function C outside of Function A?
Thanks in advance
I don't understand what you mean by formatted as 'Data' rather than 'Values'.
There's nothing wrong with the setup you describe. I use functions inside functions every now and then, and it's perfectly OK.
(Note that R is case sensitive, it's function not Function.)
FunctionA <- function() {
  FunctionB <- function() 1:2*pi
  FunctionC <- function(x)
    list(column_name_1 = x[1], column_name_2 = x[2])
  results_from_B <- FunctionB()
  results_from_C <- FunctionC(results_from_B)
  results_from_C
}
result <- FunctionA()
result
$column_name_1
[1] 3.141593
$column_name_2
[1] 6.283185
result$column_name_1
[1] 3.141593
Is this it? If not, please clarify your question.
My question is: how can I get the name of a data frame, not its column names?
For example, if d is my data frame, I want to use a function that returns the exact name "d" rather than the result of names(d).
Thank you so much!
Update:
The reason why I am asking this is because I want to write a function to generate several plots at one time. I need to change the main of the plots in order to distinguish them. My function looks like
fct=function(data){
  cor_Max = cor(data)
  solution=fa(r = cor_Max, nfactors = 1, fm = "ml")
  return(fa.diagram(solution,main=names(data)))
}
How can I change main in the function so that it corresponds to the name of the data?
You can use the fact that R allows you to obtain the text representation of an expression:
getName <- function(x) deparse(substitute(x))
print(getName(d))
# [1] "d"
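Applied to the plotting case, the captured name can be pasted into a title. The sketch below only builds the title string; plot_title, datasets and the title wording are illustrative, and the call to fa.diagram is left out so the example runs without extra packages. One caveat: when the function is called through lapply, substitute() sees lapply's internal expression rather than the data frame's name, so passing the names explicitly is safer in loops.

```r
plot_title <- function(data) {
  # deparse(substitute(data)) turns the unevaluated argument into text
  paste("Factor diagram for", deparse(substitute(data)))
}

d <- data.frame(u = 1:3, v = 4:6)
title_d <- plot_title(d)

# For several data frames, keep them in a named list and pass the names along
datasets <- list(first = d, second = d)
titles <- Map(function(data, nm) paste("Factor diagram for", nm),
              datasets, names(datasets))
```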
objects() will list all of the objects in your environment. Note that names(), as used in your question, provides the column names of the data frame.
I read your question to say that you are looking for the name of the data frame, not the column names. So you're looking for the name passed to the data argument of fct. If so, perhaps something like the following would help
fct <- function(data){
cor_Max <- cor(data)
# as.character(sys.call()) returns the function name followed by the argument values
# so the value of the "data" argument is the second element in the char vector
main <- as.character(sys.call())[2]
print(main)
}
This is a bit ad hoc but maybe it would work for your case.
The most accepted way to do this is as Robert showed, with deparse(substitute(x)).
But you could try something with match.call()
f <- function(x){
m <- match.call()
list(x, as.character(m))
}
> y <- 25
> f(y)
# [[1]]
# [1] 25
#
# [[2]]
# [1] "f" "y"
Now you've got both the value of y and its name, "y" inside the function environment. You can use as.character(m)[-1] to retrieve the object name passed to the argument x
So, your function can use this as a name, for example, like this:
fct <- function(data){
m <- match.call()
plot(cyl ~ mpg, data, main = as.character(m)[-1])
}
> fct(mtcars)
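Note that as.character(m)[-1] returns every argument in the call, so it yields more than one string once the function takes additional arguments. A slightly more targeted sketch picks the argument by name; describe and digits are hypothetical names used only for illustration:

```r
describe <- function(data, digits = 2) {
  m <- match.call()
  # m$data is the unevaluated expression supplied for 'data'
  deparse(m$data)
}

label <- describe(mtcars, digits = 3)
```

Here label holds only the name of the object passed as data, regardless of what else was supplied.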
I want to create an S4 class in R that will allow me to access large datasets (in chunks) from the cloud (similar to the goals of the ff package). Right now I'm working with a toy example called "range.vec" (I don't want to deal with internet access yet), which stores a sequence of numbers like so:
setClass("range.vec",
  representation(start = "numeric",     #beginning num in sequence
                 end = "numeric",       #last num in sequence
                 step = "numeric",      #step size
                 chunk = "numeric",     #cache a chunk here to save memory
                 chunkpos = "numeric"), #where does the chunk start in the overall vec
  contains="numeric"                    #inherits methods from numeric
)
I want this class to inherit the methods from "numeric", but I want it to use these methods on the whole vector, not just the chunk that I'm storing. For example, I don't want to define my own method for 'mean', but I want 'mean' to get the mean of the whole vector by accessing it chunk by chunk, using length(), '[', '[[', and el() functions that I've defined. I've also defined a chunking function:
setGeneric("set.chunk", function(x,...) standardGeneric("set.chunk"))
setMethod("set.chunk", signature(x = "range.vec"),
  function (x, chunksize=100, chunkpos=1) {
    #This function extracts a chunk of data from the range.vec object.
    begin <- x@start + (chunkpos - 1)*x@step
    end <- x@start + (chunkpos + chunksize - 2)*x@step
    data <- seq(begin, end, x@step) #calculate values in data chunk
    #get rid of out-of-bounds values
    data[data > x@end] <- NA
    x@chunk <- data
    x@chunkpos <- chunkpos
    return(x)
  })
When I try to call a method like 'mean', the function inherits correctly, and accesses my length function, but returns NA because I don't have any data stored in the .Data slot. Is there a way that I can use the .Data slot to point to my chunking function, or to tell the class to chunk numeric methods without defining every single method myself? I'm trying to avoid coding in C if I can. Any advice would be very helpful!
You could remove your chunk slot and replace it with numeric's .Data slot.
Little example:
## class definition
setClass("foo", representation(bar="numeric"), contains="numeric")
setGeneric("set.chunk", function(x, y, z) standardGeneric("set.chunk"))
setMethod("set.chunk",
  signature(x="foo", y="numeric", z="numeric"),
  function(x, y, z) {
    ## instead of x@chunk you could use numeric's .Data slot
    x@.Data <- y
    x@bar <- z
    return(x)
  })
a <- new("foo")
a <- set.chunk(a, 1:10, 4)
mean(a) # 5.5
Looks like there isn't a good way to do this within the class. The only solution I've found is to tell the user to loop through all of the chunks of data from the cloud and calculate as they go.
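A minimal sketch of that chunk-by-chunk loop, computing a mean without ever holding the whole vector. RangeVec, get_chunk and chunked_mean are hypothetical names for illustration, not the asker's actual class or methods, and the chunk is generated locally rather than fetched from the cloud:

```r
library(methods)

# Simplified stand-in for the range.vec class: only the sequence definition
setClass("RangeVec",
         representation(start = "numeric", end = "numeric", step = "numeric"))

# Total number of elements described by the object
range_length <- function(x) floor((x@end - x@start) / x@step) + 1

# Return the values at positions chunkpos .. chunkpos + chunksize - 1
get_chunk <- function(x, chunkpos, chunksize) {
  last <- min(chunkpos + chunksize - 1, range_length(x))
  idx <- seq(chunkpos, last)
  x@start + (idx - 1) * x@step
}

# Accumulate a running sum over the chunks, then divide by the length
chunked_mean <- function(x, chunksize = 100) {
  n <- range_length(x)
  total <- 0
  for (pos in seq(1, n, by = chunksize)) {
    total <- total + sum(get_chunk(x, pos, chunksize))
  }
  total / n
}

rv <- new("RangeVec", start = 1, end = 1000, step = 1)
m <- chunked_mean(rv)
```

The same accumulate-as-you-go pattern works for sums, counts and minima or maxima; statistics such as the median need a different approach because they cannot be combined from per-chunk results this simply.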