R> ecdf
function (x)
{
x <- sort(x)
n <- length(x)
if (n < 1)
stop("'x' must have 1 or more non-missing values")
vals <- unique(x)
rval <- approxfun(vals, cumsum(tabulate(match(x, vals)))/n,
method = "constant", yleft = 0, yright = 1, f = 0, ties = "ordered")
class(rval) <- c("ecdf", "stepfun", class(rval))
assign("nobs", n, envir = environment(rval))
attr(rval, "call") <- sys.call()
rval
}
The above is the code of ecdf(). I see that the return value is assigned with class stepfun. I don't understand what it is for. Given approxfun() does linear interpolation, why is stepfun needed? What is the purpose of adding stepfun to the class?
In the call to approxfun() in ecdf(), the method is "constant", which means it isn't doing linear interpolation, it's generating a step function.
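To make the difference concrete, here is a minimal sketch comparing the two interpolation modes of approxfun(). The knot values xs and ys are made up for illustration; the remaining arguments mirror the call inside ecdf().

```r
# Illustrative knots and cumulative proportions (not taken from the question)
xs <- c(1, 2, 3)
ys <- c(0.25, 0.75, 1)

# Same settings as in ecdf(): piecewise constant, right-continuous
step_f <- approxfun(xs, ys, method = "constant",
                    yleft = 0, yright = 1, f = 0, ties = "ordered")

# Default behaviour of approxfun(): linear interpolation between the knots
lin_f <- approxfun(xs, ys, method = "linear", yleft = 0, yright = 1)

step_f(1.5)  # 0.25: holds the value of the knot to the left
lin_f(1.5)   # 0.5: halfway between 0.25 and 0.75
step_f(0)    # 0: yleft applies below the first knot
step_f(4)    # 1: yright applies above the last knot
```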
To find out what the class on the result affects, you can use the methods() function:
methods(class="stepfun")
#> [1] knots lines plot print summary
#> see '?methods' for accessing help and source code
Created on 2022-02-12 by the reprex package (v2.0.1.9000)
So potentially if you call any of the knots(), lines(), plot(), print() or summary() functions on the result you'll get behaviour tailored to step functions. However, the class of the result is computed as c("ecdf", "stepfun", class(rval)), so there might be "ecdf" methods that override the "stepfun" methods:
methods(class="ecdf")
#> [1] plot print quantile summary
#> see '?methods' for accessing help and source code
Yes, the plot(), print() and summary() functions will call the "ecdf" methods in preference to the "stepfun" methods. That still leaves knots() and lines(), and conceivably the others could call NextMethod() to get to the "stepfun" methods.
One clarification: the method argument to approxfun() is just the name of an argument; in the discussion above, "methods" was used to refer to one of the ways R does object oriented programming, using methods and classes.
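The effect of the class vector can be checked on a small object. This is only a sketch with made-up data; it relies on the methods() listings above, which show no "ecdf" method for knots().

```r
Fn <- ecdf(c(1, 2, 2, 3))   # illustrative data

class(Fn)                   # "ecdf" "stepfun" "function"
inherits(Fn, "stepfun")     # TRUE

Fn(2)                       # 0.75: three of the four observations are <= 2

# With no "ecdf" method for knots() in the listing above,
# dispatch moves on to the next class, "stepfun"
knots(Fn)                   # 1 2 3
is.function(getS3method("knots", "stepfun"))  # TRUE
```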
I am having difficulty learning how to use eval() to evaluate a function.
Suppose I have a function:
sq <- function(y){ y**2 }
You can evaluate this function like this:
call <- match.call(expand.dots = FALSE)
call[[1]] <- as.name('sq')
call$y <- 0.2
call <- call[c(1,3)]
eval(call)
and it will give you 0.2^2 = 0.04.
But if I want to calculate something like sq(y), where y = sin(x), I might write:
call <- match.call(expand.dots = FALSE)
call[[1]] <- as.name('sq')
call$y <- as.name('sin')
call$x <- 0.2
call <- call[c(1,3:4)]
eval(call)
it will give me this error:
Error in sq(y = sin, x = 0.2) : unused argument (x = 0.2)
It seems that R does not recognize x as an argument of sin, but treats it as an argument of sq instead. How can we tell R that x is an argument of sin?
Also, it seems that R is the only language I have learned that uses eval() to evaluate a function (I know C++ and Python, but haven't seen that syntax before). What is the difference (or advantage) of evaluating a function this way instead of calling sq(y=sin(x=0.2))?
Is there a good book or tutorial that covers its usage and when to use each of the two approaches? Thanks!
PS: the example above is actually a simplified version of the code in the mlogit package I'm studying, in which the log likelihood is returned by calling 'lnl.slogit' and is passed to 'mlogit.optim' to be optimized (line 407 of https://github.com/cran/mlogit/blob/master/R/mlogit.R). I used the same method as the code in the package to call two functions, but I got the error above.
The code is trying to pass:
1) an argument x to sq, but sq has no x argument
2) the function sin as argument y, but a number is required, not a function.
Try this:
x <- 0.2
cl <- call("sq", y = quote(sin(x)))
cl
## sq(y = sin(x))
eval(cl)
## [1] 0.0394695
or maybe what you want is:
x <- 0.2
cl <- call("sq", y = sin(x))
cl
## sq(y = 0.198669330795061)
eval(cl)
## [1] 0.0394695
or
match.fun("sq")(sin(x))
## [1] 0.0394695
or just:
sq(sin(x))
## [1] 0.0394695
Note that ordinarily you do not have to use eval. Just writing the function with its arguments is enough to evaluate it, as in the last line of code.
The regression functions in R internally use non-standard code due to considerations related to environments but ordinarily that would not be needed in other contexts.
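For context, match.call() is only meaningful inside a function, where it captures the call that was made to that function. The following is a rough sketch of the capture, edit and re-evaluate pattern that modelling functions tend to use; wrapper and its extra argument are hypothetical names invented for this illustration and are not taken from mlogit.

```r
sq <- function(y) y^2

# Hypothetical wrapper: forwards its own call to sq()
wrapper <- function(y, extra = NULL) {
  cl <- match.call()         # e.g. wrapper(y = sin(x)), captured unevaluated
  cl[[1]] <- as.name("sq")   # swap the function being called
  cl$extra <- NULL           # drop an argument that sq() does not accept
  eval(cl, parent.frame())   # evaluate sq(y = sin(x)) where the caller's x lives
}

x <- 0.2
wrapper(sin(x))              # same value as sq(sin(x))
wrapper(sin(x), extra = 1)   # extra is removed before the call is evaluated
```

The nested expression sin(x) stays a single argument of sq throughout, which is why it never needs to be split into separate y and x entries of the call.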
I have a wrapper function around two functions. Each function has its own parameter vector. The main idea is to pass the parameters (one vector or two vectors) to optim and then maximize the sum of the two functions.
Since my function is quite complex, I have provided a simple example which is similar to my original function. Here is my code:
set.seed(123)
x <- rnorm(10,2,0.5)
ff <- function(x, parOpt){
out <- -sum(log(dnorm(x, parOpt[[1]][1], parOpt[[1]][2]))+log(dnorm(x,parOpt[[2]][1],parOpt[[2]][2])))
return(out)
}
# parameters in mu,sd vectors arranged in list
params <- c(set1 = c(2, 0.2), set2 = c(0.5, 0.3))
xy <- optim(par = params, fn=ff ,x=x)
which returns this error:
Error in optim(par = params, fn = ff, x = x) :
function cannot be evaluated at initial parameters
As I understand it, I get this error because optim cannot pass the parameters to each part of my function. So, how can I tell optim that the first vector holds the parameters of the first part of my function and the second vector those of the second part?
Try changing the method argument of optim.
You can read the detailed documentation of the optim function with the ?optim command.
For example, the "L-BFGS-B" method lets you set lower and upper bounds on the parameters.
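Looking at the posted code, a likely cause of the error (an interpretation, not something confirmed in the thread) is that c() flattens the two sets into one numeric vector of length 4, so parOpt[[1]][2] is NA and the objective is NA at the starting values. A minimal sketch under that assumption indexes the flat vector directly; ff2 and the lower bounds on the standard deviations are choices made for this illustration.

```r
set.seed(123)
x <- rnorm(10, 2, 0.5)

# c() flattens its arguments: a plain numeric vector of length 4, not a list
params <- c(set1 = c(2, 0.2), set2 = c(0.5, 0.3))
names(params)    # "set11" "set12" "set21" "set22"
params[[1]][2]   # NA, because params[[1]] is the single number 2

# Hypothetical rewrite: the parameter vector comes first and is indexed flat
ff2 <- function(parOpt, x) {
  -sum(dnorm(x, parOpt[1], parOpt[2], log = TRUE) +
       dnorm(x, parOpt[3], parOpt[4], log = TRUE))
}

# L-BFGS-B with lower bounds keeps both standard deviations positive
fit <- optim(par = params, fn = ff2, x = x, method = "L-BFGS-B",
             lower = c(-Inf, 1e-3, -Inf, 1e-3))
fit$par
```

optim always hands fn one numeric vector, so any grouping into "first part" and "second part" has to be done by position inside the objective function.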
I am using the randomForest package (v. 4.6-7) in R 2.15.2. I cannot find the source code for the partialPlot function and am trying to figure out exactly what it does (the help file seems to be incomplete.) It is supposed to take the name of a variable x.var as an argument:
library(randomForest)
data(iris)
rf <- randomForest(Species ~., data=iris)
x1 <- "Sepal.Length"
partialPlot(x=rf, pred.data=iris, x.var=x1)
# Error in `[.data.frame`(pred.data, , xname) : undefined columns selected
partialPlot(x=rf, pred.data=iris, x.var=as.character(x1))
# works!
typeof(x1)
# [1] "character"
x1 == as.character(x1)
# TRUE
# Now if I try to wrap it in a function...
f <- function(w){
partialPlot(x=rf, pred.data=iris, x.var=as.character(w))
}
f(x1)
# Error in as.character(w) : 'w' is missing
Questions:
1) Where can I find the source code for partialPlot?
2) How is it possible to write a function which takes a string x1 as an argument where x1 == as.character(x1), but the function throws an error when as.character is not applied to x1?
3) Why does it fail when I wrap it inside a function? Is partialPlot messing with environments somehow?
Tips/ things to try that might be helpful for solving such questions by myself in future would also be very welcome!
The source code for partialPlot() is found by entering
randomForest:::partialPlot.randomForest
into the console. I found this by first running
methods(partialPlot)
because entering partialPlot only tells me that it uses a method. From the methods call we see that there is one method, and the asterisk next to it tells us that it is a non-exported function. To view the source code of a non-exported function, we use the triple-colon operator :::. So it goes
package:::generic.method
Where package is the package, generic is the generic function (here it's partialPlot), and method is the method (here it's the randomForest method).
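The same lookup can be tried without installing anything. This sketch uses print.ecdf from the stats package purely as a stand-in for a method; getS3method() and getAnywhere() both retrieve a method whether or not it is exported.

```r
# List the S3 methods registered for a class
methods(class = "ecdf")

# Fetch one method by generic and class
m1 <- getS3method("print", "ecdf")

# Or search every loaded namespace by the full function name
m2 <- getAnywhere("print.ecdf")
m2$where          # where the object was found

# The package:::generic.method form returns the same function
identical(m1, stats:::print.ecdf)   # TRUE
```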
Now, as for the other questions, the function can be written with do.call() and you can pass w without a wrapper.
f <- function(w) {
do.call("partialPlot", list(x = rf, pred.data = iris, x.var = w))
}
f(x1)
This works on my machine. It's not so much environments as it is evaluation. Many plotting functions use some non-standard evaluation, which can be handled most of the time with this do.call() construct.
But note that outside the function you can also use eval() on x1.
partialPlot(x = rf, pred.data = iris, x.var = eval(x1))
I don't really see a reason to check for the presence of as.character() inside the function. If you can leave a comment we can go from there if you need more info. I'm not familiar enough with this package yet to go any further.
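One plausible explanation for all three symptoms is that x.var is captured unevaluated with substitute(). This is an assumption about the method's internals, not something shown in this thread. The toy function below, pick_name, is entirely hypothetical and only mimics that behaviour, including a formal argument that happens to be called w.

```r
# Hypothetical stand-in: captures 'var' unevaluated and also has a formal named 'w'
pick_name <- function(var, w) {
  var <- substitute(var)
  if (is.character(var)) {
    var            # a literal string is used as is
  } else if (is.name(var)) {
    deparse(var)   # a bare symbol contributes its name, not its value
  } else {
    eval(var)      # any other expression is evaluated in this function's frame
  }
}

x1 <- "Sepal.Length"
pick_name(x1)                 # "x1": the symbol's name, hence "undefined columns"
pick_name(as.character(x1))   # "Sepal.Length": the call is evaluated

# Inside a wrapper, 'w' resolves to pick_name's own missing formal
f_bad <- function(w) pick_name(var = as.character(w))
res_bad <- tryCatch(f_bad(x1), error = function(e) "error")
res_bad                       # "error"

# do.call() evaluates w first, so the function receives a literal string
f_ok <- function(w) do.call("pick_name", list(var = w))
f_ok(x1)                      # "Sepal.Length"
```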
How does R dispatch plot functions? The standard generic is defined as
plot <- function (x, y, ...) UseMethod("plot")
So usually, all plot methods need arguments x and y. Yet, there exists a variety of other functions with different arguments. Some functions only have argument x:
plot.data.frame <- function (x, ...)
others even have neither x nor y:
plot.formula <- function(formula, data = parent.frame(), ..., subset,
ylab = varnames[response], ask = dev.interactive())
How does this work and where is this documented?
Background
In my package papeR (see GitHub) I want to replace the function plot.data.frame, which is defined in the R package graphics, with my own version. Yet, this is usually not allowed:
Do not replace registered S3 methods from base/recommended packages,
something which is not allowed by the CRAN policies and will mean that
everyone gets your method even if your namespace is unloaded.
as Brian Ripley let me know last time I tried to do such a thing. A possible solution is as follows:
If you want to change the behaviour of a generic, say predict(), for
an existing class or two, you could add such as generic in your own
package with default method stats::predict, and then register modified
methods for your generic (in your own package).
For other methods I could easily implement this (e.g. toLatex), yet, with plot I am having problems. I added the following to my code:
## overwrite standard generic
plot <- function(x, y, ...)
UseMethod("plot")
## per default fall back to standard generic
plot.default <- function(x, y, ...)
graphics::plot(x, y, ...)
## now specify modified plot function for data frames
plot.data.frame <- function(x, variables = names(x), ...)
This works for data frames and plots with x and y. Yet, it does not work if I try to plot a formula etc:
Error in eval(expr, envir, enclos) :
argument "y" is missing, with no default
I also tried to use
plot.default <- function(x, y, ...)
UseMethod("graphics::plot")
but then I get
Error in UseMethod("graphics::plot") :
no applicable method for 'graphics::plot' applied to an object of class "formula"
So the follow up question is how I can fix this?
[Edit:] Using my solution below fixes the problems within the package. Yet, plot.formula is broken afterwards:
library("devtools")
install_github("hofnerb/papeR")
example(plot.formula, package="graphics") ## still works
library("papeR")
example(plot, package = "papeR") ## works
### BUT
example(plot.formula, package="graphics") ## is broken now
Thanks to @Roland I solved part of my problem.
It seems that the positions of the arguments are used for method dispatch (and not only the names). Names are, however, partially used. So with Roland's example
> plot.myclass <- function(a, b, ...)
> print(b)
> x <- 1
> y <- 2
> class(x) <- "myclass"
we have
> plot(x, y)
[1] 2
> plot(a = x, b = y)
[1] 2
but if we use the standard argument names
> plot(x = x, y = y)
Error in print(b) (from #1) : argument "b" is missing, with no default
it doesn't work. As one can see, x is correctly used for the dispatch but b is then "missing". Also, we cannot swap a and b:
> plot(b = y, a = x)
Error in plot.default(b = y, a = x) :
argument 2 matches multiple formal arguments
Yet, one could use a different order if the argument one wants to dispatch for is the first (?) element without name:
> plot(b = y, x)
[1] 2
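The same behaviour can be reproduced without touching plot(). In this sketch, describe and its method are hypothetical names; the generic has the same shape as plot, and the method uses different argument names, as in Roland's example.

```r
# Hypothetical generic with the same formals as plot()
describe <- function(x, y, ...) UseMethod("describe")

# Method whose argument names differ from those of the generic
describe.myclass <- function(a, b, ...) b

obj <- structure(1, class = "myclass")

describe(obj, 2)   # 2: arguments are passed on by position, obj -> a, 2 -> b

# With the generic's names, the method is still found, but inside it
# x and y match neither a nor b, so they fall into ... and b is missing
res <- tryCatch(describe(x = obj, y = 2), error = function(e) "error")
res                # "error"
```

The method receives the arguments as they were written in the original call, which is why methods are usually given the same leading argument names as their generic.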
Solution to the real problem:
I had to use
plot.default <- function(x, y, ...)
graphics::plot(x, y, ...)
The real issue was internally in my plot.data.frame method where I was using something along the lines of:
plot(x1 ~ y1)
without specifying data. Usually this works as data = parent.frame() per default. Somehow in this case this wasn't working. I now use plot(y1, x1) which works like a charm. So this was my sloppiness.
I'm using summary() on the output of the mle() function from stats4; its output belongs to class mle. I would like to find out how summary() estimates the standard deviation of the coefficients returned by mle(), but I do not see summary.mle in the list printed by methods(summary). Why can't I find a summary.mle() function?
(I guess the proper function is summary.mlm(), but I'm not sure, and I don't know why it would be mlm instead of mle.)
What you are looking for is what summary.mle would be if it were an S3 method. S3 methods are created and then dispatched using the generic_function_name.class_of_first_argument mechanism, whereas S4 methods are dispatched on the basis of their argument "signature", which allows consideration of second and later arguments. Here is how to get showMethods to display the code that is called when an S4 method is invoked. This is an instance where only the first argument is used as the signature. You can choose any of the object signatures that appear in the abbreviated output to specify the classes argument, and it is the includeDefs flag that prompts display of the code:
showMethods("summary",classes="mle", includeDefs=TRUE)
#---(output to console)----
Function: summary (package base)
object="mle"
function (object, ...)
{
cmat <- cbind(Estimate = object@coef, `Std. Error` = sqrt(diag(object@vcov)))
m2logL <- 2 * object@min
new("summary.mle", call = object@call, coef = cmat, m2logL = m2logL)
}
As shown in
>library(stats4)
>showMethods("summary")
Function: summary (package base)
object="ANY"
object="mle"
The summary is dispatched the S4 way. I don't know how to view the code from within R directly, so I searched the source of stats4 for you.
In stats4/R/mle.R, there is:
setMethod("summary", "mle", function(object, ...){
cmat <- cbind(Estimate = object@coef,
`Std. Error` = sqrt(diag(object@vcov)))
m2logL <- 2*object@min
new("summary.mle", call = object@call, coef = cmat, m2logL = m2logL)
})
So it creates an S4 object of class summary.mle. I guess you can trace the rest of the code yourself now.
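As a quick check of the method shown above, the S4 method can be retrieved with getMethod() and its standard errors compared with the square root of the diagonal of vcov(). The data and the one-parameter likelihood below are made up for this sketch.

```r
library(stats4)

# Retrieve the S4 method for signature "mle"; printing m shows its source
m <- getMethod("summary", "mle")

# Illustrative fit: normal mean with known sd = 1
set.seed(1)
x <- rnorm(50, mean = 3, sd = 1)
nll <- function(mu = 1) -sum(dnorm(x, mean = mu, sd = 1, log = TRUE))
fit <- mle(nll)

s <- summary(fit)
isS4(s)                               # TRUE: an object of class "summary.mle"

# The "Std. Error" column is sqrt(diag(vcov)), as in the method body
se_summary <- unname(s@coef[, "Std. Error"])
se_manual  <- unname(sqrt(diag(vcov(fit))))
all.equal(se_summary, se_manual)      # TRUE
```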