Creating a custom Stat object in ggplot2 - r

I'd like to create a custom Stat object for ggplot2. (Specifically, I'd like to create a smoother that works differently from the ones stat_smooth allows, for instance one without a y~x modeling function, but there are other custom Stats I'd like to create even if there were a workaround for my specific case.)
I found this suggested solution from Hadley Wickham:
StatExpo <- proto(Stat, {
  objname <- "expo"
  desc <- "Exponential smoothing"
  default_geom <- function(.) GeomLine
  calculate_groups <- function(., data, scales, variable="x", ...) {
    data$y <- HoltWinters(data$x, ...)
  }
})
stat_expo <- StatExpo$new
However, when I try it I get:
Error in proto(Stat, { : object 'Stat' not found
Upon looking around the ggplot code, I found where Stat is defined. However, the Stat object is, as far as I can tell, never exported from ggplot2.
I could write my new stat object within the ggplot2/R folder and then reinstall the package, but obviously this would be cumbersome and make the solution very difficult to share with others. How can I create a custom Stat object outside of the ggplot namespace?

ggplot2:::Stat can be used to access the non-exported object.

getFromNamespace('Stat','ggplot2')
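Both suggestions reach into a package namespace for an object that is not exported. A minimal sketch of the same two accessors using only base R, so it runs without ggplot2; print.lm is used here because it is a registered S3 method in stats rather than an exported function:

```r
# Check whether print.lm appears among the exports of the stats namespace.
is_exported <- "print.lm" %in% getNamespaceExports("stats")

# Two ways to fetch the object regardless of whether it is exported.
via_triple_colon <- stats:::print.lm
via_get <- utils::getFromNamespace("print.lm", "stats")

is_exported
identical(via_triple_colon, via_get)
```

In ggplot2 2.0.0 and later, Stat is exported and custom stats are built with ggproto() instead of proto(), so neither accessor is needed there.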

Related

ggpubr can't find 'mean_se' unless ggpubr is attached via library()

Summary of problem: When I try to add summary stats to a ggpubr plot via the "add" parameter, ggpubr can't find the summary stat functions (example code below). For instance, if I am trying to add error bars with add="mean_se" I get an error message and no error bars.
Unsatisfactory solution: Attaching ggpubr by calling library(ggpubr) would fix this problem. See this answer.
Why the above solution is unsatisfactory: I am developing a package, and so would like to avoid attaching other packages via calls to library() - my understanding is that this is best practice, to avoid polluting the namespace with things the user might not have anticipated would get loaded.
MY QUESTION: Is there some way to get ggpubr to find mean_se without attaching the package?
Example code (in an .R file in my package):
make.plot = function(){
utils::data("iris")
ggpubr::ggbarplot(
data = iris,
x = "Species",
y = "Sepal.Length",
add = "mean_se")
}
Example output:
> devtools::load_all(".")
# i Loading MyPackage
> make.plot()
# Warning message:
# Computation failed in `stat_summary()`:
# object 'mean_se_' of mode 'function' was not found
One thing that should work, but doesn't, is passing "ggpubr::mean_se_" as the add argument. This avoids the error message, but produces an incorrect plot. The plot should look like this:
But passing "ggpubr::mean_se_" instead produces:
Additional weirdness: If I ever add a call to load ggpubr, build MyPackage with devtools::load_all("."), and run it, then the above code never fails until I quit and reload RStudio, even if I delete the call library(ggpubr) from my package and build it again.
Since you're creating a package you can ensure that mean_se_ is in the search path by importing it in your function.
If you use roxygen you can add the tag @importFrom ggpubr mean_se_:
#' @importFrom ggpubr mean_se_
make.plot = function(){
  utils::data("iris")
  ggpubr::ggbarplot(
    data = iris,
    x = "Species",
    y = "Sepal.Length",
    add = "mean_se")
}
Then you run devtools::document() and run your function, and your output should look like this:
If you look at the way the add parameter is handled inside ggpubr, it is actually matched as a string to one of the summary functions. It seems that the summary functions need to be on the search path when ggbarplot is called.
The easiest way round this is to copy the function over to your own package namespace:
mean_se_ <- ggpubr::mean_se_
make.plot = function(){
utils::data("iris")
ggpubr::ggbarplot(
data = iris,
x = "Species",
y = "Sepal.Length",
add = "mean_se_")
}
make.plot()
Created on 2022-05-04 by the reprex package (v2.0.1)
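The error message in the question ("object 'mean_se_' of mode 'function' was not found") is the one base R's get() raises when a function is looked up by name and is not visible from the environment doing the lookup. A minimal sketch of that mechanism using only base R; resolve_summary(), pkg_env and my_se_summary are hypothetical names for illustration and are not part of ggpubr:

```r
# Hypothetical helper: turn a function name into a function, the way a
# string-valued argument has to be resolved somewhere downstream.
resolve_summary <- function(name, envir) {
  get(name, mode = "function", envir = envir)
}

# Stand-in for a package namespace in which the function was never defined.
pkg_env <- new.env(parent = baseenv())

# The lookup fails; keep the error message instead of stopping.
before <- tryCatch(
  is.function(resolve_summary("my_se_summary", pkg_env)),
  error = function(e) conditionMessage(e)
)

# Copying the function into that environment makes the lookup succeed,
# mirroring `mean_se_ <- ggpubr::mean_se_` in the answer above.
assign(
  "my_se_summary",
  function(x) stats::sd(x) / sqrt(length(x)),
  envir = pkg_env
)
after <- is.function(resolve_summary("my_se_summary", pkg_env))

before  # the "not found" error message
after   # TRUE
```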

How to change R plot default options

I'd like to change the default plot option from type = "p" to type = "l"; that is, I want this default to apply from the start of each new session, without having to specify it every time.
I've tried to put some piece of code in my Rprofile.site but unfortunately not the right one: firstly I wanted to use setDefaults but this package is deprecated; I also tried to set a hook but couldn't make it work.
Any ideas ?
thanks !
This could be done by adding to your Rprofile
formals(plot.default)$type <- "l"
But that would be highly discouraged, for the reasons Roland states in his comment. A better solution would be to place this in your Rprofile:
lplot <- function(x, y, type = "l", ...){
plot(x, y, type = type, ...)
}
This gives you the default you want, the ability to revert back to normal if wanted, and doesn't affect the existing plot function.
But this still comes with the downside of the lplot function seemingly appearing out of nowhere. Far better would be to put lplot in a package. Even if you load the package in the Rprofile, at least ?lplot will pull up something to indicate where it came from.
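A quick check of the wrapper approach, drawing to a null PDF device so that nothing is written to disk. It shows that lplot() defaults to lines, can still be overridden per call, and leaves plot.default's own default untouched:

```r
lplot <- function(x, y, type = "l", ...) {
  plot(x, y, type = type, ...)
}

grDevices::pdf(NULL)               # throwaway device, no file is created
lplot(1:10, (1:10)^2)              # drawn as a line
lplot(1:10, (1:10)^2, type = "p")  # points again, on request
invisible(grDevices::dev.off())

formals(lplot)$type                   # "l"
formals(graphics::plot.default)$type  # "p"
```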

Store output from gridExtra::grid.arrange into an object

I am placing multiple plots into one image using gridExtra::grid.arrange and would like to have the option of saving the combined plot as an object that could be returned from within a function as part of a list of returned objects. Ideally, I would like to do this without printing the plot object.
The code below creates two plots, combines them with grid.arrange, and attempts to save the result into x. However, x evaluates to NULL and the plot is printed. The documentation for grid.arrange points me to arrangeGrob and suggests plotting can be turned off using plot=FALSE, but I get an error when I try that because FALSE is not a grob object.
Any suggestions for what I'm not understanding?
# R under development
# Windows 7 (32 bit)
# ggplot2 1.0.0
# gridExtra 0.9.1
p1 <- ggplot(mtcars, aes(x=factor(cyl), y=mpg)) + geom_boxplot()
p2 <- ggplot(mtcars, aes(x=factor(cyl), y=wt)) + geom_boxplot()
x <- gridExtra::grid.arrange(p1, p2)
x
Per the comments, I'm adding this edit. When I try it with arrangeGrob, I get no output at all.
> gridExtra::arrangeGrob(p1, p2)
> print(gridExtra::arrangeGrob(p1, p2))
Error: No layers in plot
> x <- gridExtra::arrangeGrob(p1, p2)
> x
Error: No layers in plot
The code in your edit does not work properly since you didn't load gridExtra.
library(gridExtra)
y <- arrangeGrob(p1, p2, ncol = 1)
class(y)
#[1] "gtable" "grob" "gDesc"
grid.draw(y)
Edit: since version 2.0.0, my comment about grid dependency below is no longer valid, since grid is now imported.
Edit: With gridExtra version >= 2.0.0, there is no need to attach either package,
p <- ggplot2::qplot(1,1)
x <- gridExtra::arrangeGrob(p, p)
grid::grid.draw(x)
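Because arrangeGrob() returns an ordinary grob, it can be stored and returned like any other value and drawn later. The sketch below illustrates that pattern with plain grid grobs standing in for the ggplot objects, so it runs without ggplot2 or gridExtra; build_report() is a hypothetical name:

```r
build_report <- function() {
  # Stand-in for gridExtra::arrangeGrob(p1, p2): one grob with two children.
  combined <- grid::grobTree(
    grid::rectGrob(),
    grid::textGrob("combined plot")
  )
  # Nothing is drawn here; the grob is just one element of the result.
  list(n_panels = 2L, plot = combined)
}

res <- build_report()
inherits(res$plot, "grob")

# Draw only when wanted (here on a null device, so no file is created).
grDevices::pdf(NULL)
grid::grid.newpage()
grid::grid.draw(res$plot)
invisible(grDevices::dev.off())
```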
Funny that this was asked so recently - I was running into this problem as well this week and was able to solve it in a bit of a hacky way, but I couldn't find any other solution I was happier with.
Problem 1: ggplotGrob is not found
I had to make sure ggplot2 is loaded. I don't completely understand what's happening (I admit I don't fully understand imports/depends/attaching/etc), but the following fixes that. I'd be open to feedback if this is very dangerous.
if (!"package:ggplot2" %in% search()) {
suppressPackageStartupMessages(attachNamespace("ggplot2"))
on.exit(detach("package:ggplot2"))
}
Somebody else linked to this blog post and I think that works as well, but from my (non-complete) understanding, this solution is less horrible. I think.
Problem 2: no layers in plot
As you discovered too, fixing that problem allows us to use grid.arrange, but that returns NULL and doesn't allow saving to an object. So I also wanted to use arrangeGrob, but I ran into the above error when gridExtra was not already loaded. Applying the fix from problem 1 again doesn't seem to work (maybe the package is getting detached too early?). BUT I noticed that calling grid::grid.draw on the result of arrangeGrob prints it fine without error. So I added a custom class to the output of arrangeGrob and added a print method for that class that simply calls grid.draw:
f <- function(...) {
  # forward whatever was passed to f() on to arrangeGrob
  plot <- gridExtra::arrangeGrob(...)
  class(plot) <- c("ggExtraPlot", class(plot))
  plot
}
print.ggExtraPlot <- function(x, ...) {
  grid::grid.draw(x)
}
Hooray, now I can open a fresh R session with no packages explicitly loaded, and I can successfully call a function that creates a grob and print it later!
You can see the code in action in my package on GitHub.

Why can't I pass a dataset to a function?

I'm using the package glmulti to fit models to several datasets. Everything works if I fit one dataset at a time.
So for example:
output <- glmulti(y~x1+x2,data=dat,fitfunction=lm)
works just fine.
However, if I create a wrapper function like so:
analyze <- function(dat)
{
out<- glmulti(y~x1+x2,data=dat,fitfunction=lm)
return (out)
}
it simply doesn't work. The error I get is:
error in evaluating the argument 'data' in selecting a method for function 'glmulti'
Unless there is a data frame named dat, it doesn't work. If I use results=lapply(list_of_datasets, analyze), it doesn't work.
So what gives? Without my said wrapper, I can't lapply a list of datasets through this function. If anyone has thoughts or ideas on why this is happening or how I can get around it, that would be great.
example 2:
dat=list_of_data[[1]]
analyze(dat)
works fine. So in a sense it is ignoring the argument and just literally looking for a data frame named dat. It behaves the same no matter what I call it.
I guess this is -yet another- problem due to the definition of environments in the parse tree of S4 methods (one of the reasons why I am not a big fan of S4...)
It can be shown by adding quotes around the dat :
> analyze <- function(dat)
+ {
+ out<- glmulti(y~x1+x2,data="dat",fitfunction=lm)
+ return (out)
+ }
> analyze(test)
Initialization...
Error in eval(predvars, data, env) : invalid 'envir' argument
You should in the first place send this information to the maintainers of the package, as they know how they deal with the environments internally. They'll have to adapt the functions.
A -very dirty- workaround for yourself is to put "dat" in the global environment and delete it afterwards:
analyze <- function(dat)
{
assign("dat",dat,envir=.GlobalEnv) # put the dat in the global env
out<- glmulti(y~x1+x2,data=dat,fitfunction=lm)
remove(dat,envir=.GlobalEnv) # delete dat again from global env
return (out)
}
EDIT:
Just for clarity, this is really about the worst solution possible, but I couldn't manage to find anything better. If somebody else gives you a solution where you don't have to touch your global environment, by all means use that one.
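The symptom can be reproduced without glmulti. In the sketch below, fit_by_name() is a hypothetical stand-in that looks its data argument up by name in the global environment; this is an assumption made for illustration, not glmulti's actual implementation. The wrapper uses the same workaround as above, with on.exit() so the temporary global is removed even if the fit fails:

```r
# Hypothetical stand-in: resolves `data` by name in the global environment.
fit_by_name <- function(formula, data) {
  data_name <- deparse(substitute(data))
  d <- get(data_name, envir = globalenv())
  stats::coef(stats::lm(formula, data = d))
}

analyze <- function(dat) {
  assign("dat", dat, envir = globalenv())  # make `dat` visible globally
  on.exit(rm("dat", envir = globalenv()))  # always clean up afterwards
  fit_by_name(y ~ x1, data = dat)
}

set.seed(1)
datasets <- list(
  a = data.frame(x1 = 1:10, y = 2 * (1:10) + rnorm(10)),
  b = data.frame(x1 = 1:10, y = -1 * (1:10) + rnorm(10))
)
results <- lapply(datasets, analyze)

# The temporary global has been removed again.
exists("dat", envir = globalenv(), inherits = FALSE)
```

Like the original workaround, this overwrites and then deletes any existing global object named dat, so it is only safe when no such object is in use.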

modify lm or loess function to use it within ggplot2's geom_smooth

I need to modify the lm (or eventually loess) function so I can use it in ggplot2's geom_smooth (or stat_smooth).
For example, this is how stat_smooth is used normally:
> qplot(data=diamonds, carat, price, facets=~clarity) + stat_smooth(method='lm')
I would like to define a custom lm2 function to use as value for the method parameter in stat_smooth, so I can customize its behaviour.
> lm2 <- function(formula, data, ...)
{
print(head(data))
return(lm(formula, data, ...))
}
> qplot(data=diamonds, carat, price, facets=~clarity) + stat_smooth(method='lm2')
Note that I have used method='lm2' as parameter in stat_smooth.
When I execute this code I get the error:
Error in eval(expr, envir, enclos) : 'nthcdr' needs a list to CDR down
Which I don't understand very well. The lm2 method works very well when run outside of stat_smooth. I played with this a bit and I have got different types of error, but since I am not comfortable with R's debug tools it is difficult for me to debug them. Honestly, I don't get what I should put inside the return() call.
There is some weirdness in using ... as an argument in a function call that I don't fully understand (it has something to do with ... being a list-type object).
Here is a version that works by taking the function call as an object, setting the function to be called to lm and then evaluating the call in the context of our own caller. The result of this evaluation is our return value (in R the value of the last expression in a function is the value returned, so we do not need an explicit return).
foo <- function(formula,data,...){
  print(head(data))
  x<-match.call()
  x[[1]]<-quote(lm)
  eval.parent(x)
}
If you want to add arguments to the lm call, you can do it like this:
x$na.action <- 'na.exclude'
If you want to drop arguments to foo before you call lm, you can do it like this
x$useless <- NULL
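Putting those pieces together, a self-contained sketch using only base R and the built-in mtcars data. The wrapper name lm_wrapper and its verbose argument are hypothetical; na.exclude is set as in the snippet above:

```r
lm_wrapper <- function(formula, data, ..., verbose = FALSE) {
  if (verbose) print(head(data))
  cl <- match.call()
  cl[[1]] <- quote(stats::lm)        # call lm instead of lm_wrapper
  cl$verbose <- NULL                 # drop an argument lm does not know
  cl$na.action <- quote(na.exclude)  # add an argument for lm
  eval.parent(cl)                    # evaluate where lm_wrapper was called
}

fit <- lm_wrapper(mpg ~ wt, data = mtcars)
fit$call
coef(fit)
```

Because the rewritten call is evaluated in the caller's frame, the formula and data are resolved exactly as they would be in a direct call to lm.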
By the way, geom_smooth and stat_smooth pass any extra arguments to the smoothing function, so you need not create a function of your own if you only need to set some extra arguments
qplot(data=diamonds, carat, price, facets=~clarity) +
stat_smooth(method="loess",span=0.5)
