I'm using the 'caret' library to do some cross-validation on some trees.
The library provides a function called train, which takes a named argument "method". Via its ellipsis (...) it's supposed to let other arguments fall through to another function that it calls. This other function (rpart) takes an argument of the same name, "method".
Therefore I want to pass two arguments with the same name... and it's clearly failing. I tried to work around this as shown below, but I get the error:
"Error in train.default(x = myx, y = myy, method = "rpart2", preProcess = NULL, :
formal argument "method" matched by multiple actual arguments"
Any help is much appreciated! Thanks!
train.wrapper = function(myx, myy, mytrControl, mytuneLenght, ...) {
  result = train(
    x = myx,
    y = myy,
    method = "rpart2",
    preProcess = NULL,
    ...,
    weights = NULL,
    metric = "Accuracy",
    trControl = mytrControl,
    tuneLength = mytuneLenght
  )
  return(result)
}
dtree.train.cv = train.wrapper(training.matrix[,2:1777],
training.matrix[,1],
2, method="class")
Here's a mock-up of your problem with a tr (train) function that calls an rp (rpart) function, passing it ...:
rp <- function(method, ...) method
tr <- function(method, ...) rp(...)
# we want to pass 2 to rp:
tr(method=1, method=2) # Error
tr(1, method=2) # 1 (wrong value!)
tr(method=1, metho=2) # 2 (Yay!)
What magic is this? And why does the last case actually work?! Well, we need to understand how argument matching works in R. A function f <- function(foo, bar) is said to have formal parameters "foo" and "bar", and the call f(foo=3, ba=13) is said to have (actual) arguments "foo" and "ba".
R first matches all arguments that have exactly the same name as a formal parameter. This is why the first "method" argument gets passed to train. Two identical argument names cause an error.
Then, R matches any argument names that partially match a (yet unmatched) formal parameter. But if two argument names partially match the same formal parameter, that also causes an error. Also, partial matching only applies to formal parameters before .... So formal parameters after ... must be specified using their full names.
Then the unnamed arguments are matched in positional order to the remaining formal arguments.
Finally, if the formal arguments include ..., the remaining arguments are put into the ....
PHEW! So in this case, the call to tr fully matches method, and then passes the rest into .... When tr then calls rp, the metho argument partially matches its formal parameter method, and all is well!
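To see each rule in isolation, here is a tiny hypothetical function (g is mine, not from the question) that exercises all of them:
g <- function(alpha, beta, ..., gamma = 0) c(alpha = alpha, beta = beta, gamma = gamma)
g(1, 2)             # positional: alpha = 1, beta = 2
g(be = 2, 1)        # partial: "be" matches beta, then 1 goes to alpha by position
g(1, 2, gam = 3)    # "gam" does NOT partially match gamma (it sits after ...), so gam = 3 lands in ...
g(1, 2, gamma = 3)  # formals after ... must be named in full: gamma = 3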
...Still, I'd try to contact the author of train and point out this problem so he can fix it properly! Since "rpart" and "rpart2" are supposed to be supported, he must have missed this use case!
I think he should rename his method parameter to method. or similar (anything that has "method" as a strict prefix). This will still be backward compatible, but allows another method argument to be passed correctly to rpart.
Generally wrappers will pass their parameters in a named list. In the case of train, provision for control is passed in the trControl argument. Perhaps you should try:
dtree.train.cv = train.wrapper(training.matrix[,2:1777],
training.matrix[,1],
2, # will be positionally matched, probably to 'mytuneLenght'
mytrControl=list(method="class") )
After your comment I reviewed the train and rpart help pages again. You could well be correct in thinking that trControl has a different purpose. I suspect that you may need to construct your call with a formula, since rpart only has a formula method. If the y argument is a factor, then method="class" will be assumed by rpart. And ... running modelLookup:
modelLookup("rpart2")
model parameter label seq forReg forClass probModel
154 rpart2 maxdepth Max Tree Depth TRUE TRUE TRUE TRUE
... suggests to me that a "class" method would be assumed by default as well. You may also need to edit your question to include a data example (perhaps from the rpart help page?) if you want further advice.
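In the meantime, here is a hedged sketch (untested, reusing your objects) of a call that sidesteps the name clash entirely: pass y as a factor and drop method="class", since rpart will then assume it:
library(caret)
## with a factor y, rpart assumes method = "class" on its own, so the
## clashing argument never needs to travel through train's ... at all:
dtree.train.cv <- train(x = training.matrix[, 2:1777],
                        y = factor(training.matrix[, 1]),
                        method = "rpart2",
                        trControl = trainControl(method = "cv"),
                        tuneLength = 2)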
Related
I have this data frame:
df <- data.frame(ref_ws = ref_ws,
turb_ws = turb_ws,
ref_wd = ref_wd,
fcf = turb_ws/ref_ws,
ref_fi = ref_fi,
shear = shear,
turbulence_intensity = turbulence_intensity,
inflow = inflow,
veer = veer)
that is part of a function where I define optional arguments (shear, turbulence_intensity, inflow and veer):
trial_plots <- function(ref_ws,turb_ws,ref_wd,shear,turbulence_intensity,inflow,veer)
The variables ref_ws, turb_ws, ref_wd are mandatory, but the others are optional.
The optional ones should each generate an individual plot if the corresponding argument is supplied in the call.
For example, if shear is not used, I want to continue and see if it can generate the next plot regarding the turbulence_intensity and so on.
At the moment this is the error:
Error in data.frame(ref_ws = ref_ws, turb_ws = turb_ws,ref_wd = ref_wd, :
argument "veer" is missing, with no default
How can I define these arguments to be optional?
Hadley recommends using NULL as the default value and testing it with is.null() in the function body:
Sometimes you want to add a non-trivial default value, which might take several lines of code to compute. Instead of inserting that code in the function definition, you could use missing() to conditionally compute it if needed. However, this makes it hard to know which arguments are required and which are optional without carefully reading the documentation. Instead, I usually set the default value to NULL and use is.null() to check if the argument was supplied.
From Advanced R book
I think it's useful advice and personally use it a lot.
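Applied to trial_plots, a minimal sketch (the plot() calls are placeholders for your actual plotting code):
trial_plots <- function(ref_ws, turb_ws, ref_wd,
                        shear = NULL, turbulence_intensity = NULL,
                        inflow = NULL, veer = NULL) {
  ## the mandatory arguments are always plotted
  plot(ref_ws, turb_ws)
  ## each optional plot is produced only if the argument was supplied
  if (!is.null(shear))                plot(ref_ws, shear)
  if (!is.null(turbulence_intensity)) plot(ref_ws, turbulence_intensity)
  if (!is.null(inflow))               plot(ref_ws, inflow)
  if (!is.null(veer))                 plot(ref_ws, veer)
}
The same is.null() test lets you build the data frame only from the columns that were actually supplied, which avoids the "argument "veer" is missing" error.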
I am trying to figure out if it is possible, with a sane amount of programming, to create a certain debugging function by using R's metaprogramming features.
Suppose I have a block of code in which each line uses, as all or part of its input, the output from the line before (the sort of code you might build with pipes, though no pipe is used here).
{
f1(args1) -> out1
f2(out1, args2) -> out2
f3(out2, args3) -> out3
...
fn(out<n-1>, args<n>) -> out<n>
}
Where for example it might be that:
f1 <- function(first_arg, second_arg, ...){my_body_code},
and you call f1 in the block as:
f1(second_arg = 1:5, list(a1 ="A", a2 =1), abc = letters[1:3], fav = foo_foo)
where foo_foo is an object defined in the calling environment of f1.
I would like a function I could wrap around my block that would, for each line of code, create an entry in a list. Each entry would be named (line1, line2) and each line entry would have a sub-entry for each argument and for the function output. The argument entries would consist, first, of the name of the formal to which the actual argument is matched; second, the expression or name supplied to that argument if there is one (and a placeholder if the argument is just a constant); and third, the value of that expression as if it were immediately forced on entry into the function. (I'd rather have the value as of the moment the promise is first kept, but that seems to me like a much harder problem, and the two values will most often be the same.)
All the arguments assigned to the ... (if any) would go in a dots = list() sublist, with entries named if they have names and appropriately labeled (..1, ..2, etc.) if they are assigned positionally. The last element of each line sublist would be the name of the output and its value.
The point of this is to create a fairly complete record of the operation of the block of code. I think of this as analogous to an elaborated version of purrr::safely that is not confined to iteration and keeps a more detailed record of each step, and indeed if a function exits with an error you would want the error message in the list entry as well as as much of the matched arguments as could be had before the error was produced.
It seems to me like this would be very useful in debugging linear code like this. This lets you do things that are difficult using just the RStudio debugger. For instance, it lets you trace code backwards. I may not know that the value in out2 is incorrect until after I have seen some later output. Single-stepping does not keep intermediate values unless you insert a bunch of extra code to do so. In addition, this keeps the information you need to track down matching errors that occur before promises are even created. By the time you see output that results from such errors via single-stepping, the matching information has likely evaporated.
I have actually written code that takes a piped function and eliminates the pipes to put it in this format, just using text manipulation. (Indeed, it was John Mount's "Bizarro pipe" that got me thinking of this.) And if I, or we, or you, can figure out how to do this, I would hope to make a serious run at a second version where each function calls the next, supplying it with arguments internally rather than externally, like a traceback where you get the passed argument values as well as the function name and formals. Other languages have debugging environments like that (e.g. GDB), and I've been wishing for one for R for at least five years, maybe 10, and this seems like a step toward it.
Just issue the trace shown for each function that you want to trace.
f <- function(x, y) {
z <- x + y
z
}
trace(f, exit = quote(print(returnValue())))
f(1,2)
giving the following, which shows the function name, the input, and the output. (The last 3 is from the function itself.)
Tracing f(1, 2) on exit
[1] 3
[1] 3
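If you also want the argument values on entry, trace() accepts a tracer expression as well (a small extension of the above; note the untrace() first, since f is already being traced):
untrace(f)
trace(f, tracer = quote(cat("entering with x =", x, "and y =", y, "\n")),
      exit = quote(print(returnValue())))
f(1, 2)
giving:
Tracing f(1, 2) on entry
entering with x = 1 and y = 2
Tracing f(1, 2) on exit
[1] 3
[1] 3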
I have a function:
func <- function(x) {
  arguments <- match.call()
  return(arguments)
}
1) If I call my function specifying the argument in the call:
func("value")
I get:
func(x = "value")
2) If I call my function by passing a variable:
my_variable <-"value"
func(my_variable)
I get:
func(x = my_variable)
Why are the first and the second results different?
Can I somehow get "func(x = "value")" in the second call as well?
I'm thinking my problem is that the environment inside a function simply doesn't contain values if they were passed via variables; the environment contains only the names of variables for further lookup. Is there a way to follow such a reference and get the value from inside a function?
In R, when you pass my_variable as formal argument x into a function, the value of my_variable will only be retrieved when the function tries to read x (if it does not use x, my_variable will not be read at all). The same applies when you pass more complicated arguments, such as func(x = compute_my_variable()) -- the call to compute_my_variable will take place when func tries to read x (this is referred to as lazy evaluation).
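For instance (a minimal sketch; compute_my_variable and func2 are hypothetical):
compute_my_variable <- function() { cat("computing now\n"); 42 }
func2 <- function(x) { cat("entered func2\n"); x * 2 }
func2(compute_my_variable())
## entered func2   <- the function body runs first
## computing now   <- the argument is only evaluated when x is read
## [1] 84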
Given lazy evaluation, what you are trying to do is not well defined because of side effects - in which order would you like to evaluate the arguments? Which arguments would you like to evaluate at all? (note a function can just take an expression for its argument using substitute, but not evaluate it). As a side effect, compute_my_variable could modify something that would impact the result of another argument of func. This can happen even when you only passed variables and constants as arguments (function func could modify some of the variables that will be later read, or even reading a variable such as my_variable could trigger code that would modify some of the variables that will be read later, e.g. with active bindings or delayed assignment).
So, if all you want to do is to log how a function was called, you can use sys.call (or match.call, but that indeed expands argument names, etc.). If you wanted a more complete stack trace, you can use e.g. traceback(1).
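A quick demonstration of the difference (fn is a toy function of mine):
fn <- function(x) list(as_typed = sys.call(), matched = match.call())
my_variable <- "value"
fn(my_variable)
## $as_typed
## fn(my_variable)
## $matched
## fn(x = my_variable)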
If for some reason you really wanted values of all arguments, say as if they were all read in the order of match.call, which is the order in which they are declared, you can do it using eval (returns them as list):
lapply(as.list(match.call())[-1], eval)
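For example (a hedged variant: I pass envir = parent.frame() explicitly so the symbols are resolved in the caller's frame):
func <- function(x, y) lapply(as.list(match.call())[-1], eval, envir = parent.frame())
my_variable <- "value"
func(my_variable, nchar(my_variable))
## $x
## [1] "value"
## $y
## [1] 5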
Can't you simply
return(paste('func(x =', x, ')'))
consumeSingleRequest <- function(api_key, URL, columnNames, globalParam="", ...)
consumeSingleRequest <- function(api_key, URL, columnNames, valuesList, globalParam="")
I am trying to overload a function like this, which takes in multiple lists in the first version and combines them into one list of lists. However, I don't seem to be able to skip passing in globalParam and pass in only the multiple lists via the ...
Does anyone know how to do that?
I've heard S3 methods could be used for that? Does anyone know how?
R doesn't support the concept of overloading functions. It supports function calls with a variable number of arguments. So you can declare a function with any number of arguments, but supply only a subset of those when actually calling it. Take the vector function as an example:
> vector
function (mode = "logical", length = 0L)
.Internal(vector(mode, length))
<bytecode: 0x103b89070>
<environment: namespace:base>
It supports up to 2 parameters, but can be called with none or a subset (in that case the default values are used):
> vector()
logical(0)
> vector(mode='numeric')
numeric(0)
So you only need a second declaration:
consumeSingleRequest <- function(api_key, URL, columnNames, valuesList, globalParam="")
and just supply the needed parameters when actually calling the function:
consumeSingleRequest(api_key=..., valuesList=...)
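If you do want the first call pattern too (several lists collected via ...), a hedged sketch (the body is a placeholder): declare globalParam after ..., so that unnamed lists all land in ... and globalParam can only be matched by its full name, which lets you skip it safely:
consumeSingleRequest <- function(api_key, URL, columnNames, ..., globalParam = "") {
  valuesList <- list(...)  # combine however many lists were passed into one list of lists
  ## ... rest of the implementation ...
}
## both of these now work:
## consumeSingleRequest(key, url, cols, list1, list2, list3)      # lists go into ...
## consumeSingleRequest(key, url, cols, list1, globalParam = "x") # named override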
P.S. A good explanation can be found in Advanced R Book.
I now have the class construction working in two ways:
The first,
setMethod("initialize", signature(.Object = "BondCashFlows"),
function(.Object, x, y, ...){
do some things .Object#foo = array[,m]
}
The second,
BondCashFlows <- function(...) {
  ## do some things
  new("BondCashFlows", ...)
}
So, my question is: why do I even have to bother with the first, since the second is a much more user-friendly way of creating the object BondCashFlows?
I understand that the first is a method on a class, but I am not sure why I have to do this.
One of the advantages of using an S4 method over a simple R function is that the method is strongly typed.
Having a signature is a guard that methods aren't exposed to types that don't meet their signature requirements; otherwise an exception is thrown.
It's often the case that you want to differentiate method behavior
depending on the parameter type passed. Strong typing makes that
very easy and simple.
Strongly typed code is also more human-readable (even if in R this argument can be debated; the S4 syntax is not very intuitive, especially for a beginner).
Here is an example where I define a simple function and then wrap it in a method:
show.vector <- function(.object, name, ...) .object[, name]
## you should first define a generic
setGeneric("returnVector", function(.object, name, ...)
  standardGeneric("returnVector")
)
## the method here is just calling the showvector function.
## Note that the function argument types are explicitly defined.
setMethod("returnVector", signature(.object="data.frame", name="character"),
def = function(.object, name, ...) show.vector(.object,name,...),
valueClass = "data.frame"
)
Now if you test this :
show.vector(mtcars,'cyl') ## works
show.vector(mtcars,1:10) ## DANGER!! works, but not the desired behavior
show.vector(mtcars,-1) ## DANGER!! works, but not the desired behavior
comparing to the method call:
returnVector(mtcars,'cyl') ## works
returnVector(mtcars,1:10) ## SAFER: throws an exception
returnVector(mtcars,-1) ## SAFER: throws an exception
Hence, if you will expose your functions to others, it is better to encapsulate them in methods.
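And to connect this back to your initialize question: the two approaches are not mutually exclusive. A hedged sketch (the foo slot is invented for illustration): keep the initialize method for the invariants, and let the friendly constructor delegate to new(), which invokes initialize for you:
setClass("BondCashFlows", slots = c(foo = "numeric"))
setMethod("initialize", "BondCashFlows",
          function(.Object, ..., foo = numeric()) {
            .Object <- callNextMethod(.Object, ...)
            .Object@foo <- foo  # do some things, e.g. derive or validate slots
            .Object
          })
## the user-friendly wrapper just delegates to new() (and hence to initialize):
BondCashFlows <- function(foo = numeric()) new("BondCashFlows", foo = foo)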