I am documenting my research in rmarkdown workbooks but want to also save my ggplots into a variable to reconfigure the plots for other cases, for example resize them for presentations.
First, I had a function that prints my plot and saves into a global variable:
PLOTS <- list()
`%<p%` <- function(name, ggplot){
PLOTS[[name]] <<- ggplot
print(ggplot)
return(invisible(NULL)) # better use ggplot here (if used with %>but this is easier for question
}
# Plots and saves
"testplot" %<p% qplot(x = Sepal.Length, y = Sepal.Width, data = iris)
Soon I had the problem that I want to save one version of the plot, but plot another version, for example save the plot but then plot in my workbook two versions with different x axis limits. So I introduced that the function recognizes invisible():
`%<p%`<- function(name, ggplot){
v <- withVisible(ggplot)$visible
if(v) print(ggplot)
PLOTS[[name]] <<- ggplot
return(invisible(NULL))
}
# Save full plot but print only x between 4 and 5
"testplot" %<p% invisible(qplot(x = Sepal.Length, y = Sepal.Width, data = iris))
PLOTS$testplot + coord_cartesian(xlim = c(4,5))
Works beautifully. Then I got really lazy and wanted to use the RStudio shortcut for <- instead of typing unwieldy special characters for %<p%, so I thought of making this function an implementation of the $<- generic, for a new class "plotlist". I then create a list with this new class to cause invokation of `$<-.plotlist`() when an element is assigned to this.
PLOTS2 <- structure(list(), class = "plotlist")
`$<-.plotlist` <- function(x, name, value){
v <- withVisible(value)$visible
if(v) print(value)
NextMethod()
}
But now things get strange as now the invisible() does not work anymore. For example, this renders the ggplot:
PLOTS2$test <- invisible(qplot(x = Sepal.Length, y = Sepal.Width, data = iris))
On the other hand, if I hadn't used the custom $<- implementation, it wouldn't even print by default, for example if I use my ordinary list without the special class "plotlist" to store my plot!
PLOTS$test <- qplot(x = Sepal.Length, y = Sepal.Width, data = iris)
How is this? I do not really know how invisible() works and the manuals of invisible() and withVisibility() say only the object stays invisible "for a while". What are the criteria how long is an object invisible, why is it not in my custom $<- implementation and can I make it honor invisible()?
Related
I have many ggplot objects where I wish to print some text (varies from plot to plot) in the same relative position on each plot, regardless of scale. What I have come up with to make it simple is to
define a rescale function (call it sx) to take the relative position I want and return that position on the plot's x axis.
sx <- function(pct, range=xr){
position <- range[1] + pct*(range[2]-range[1])
}
make the plot without the text (call it plt)
Use the ggplot_build function to find the x scale's range
xr <- ggplot_build(plt)$layout$panel_params[[1]]$x.range
Then add the text to the plot
plt <- plt + annotate("text", x=sx(0.95), ....)
This works well for me, though I'm sure there are other solutions folks have derived. I like the solution because I only need to add one step (step 3) to each plot. And it's a simple modification to the annotate command (x goes to sx(x)).
If someone has a suggestion for a better method I'd like to hear about it. There is one thing about my solution though that gives me a little trouble and I'm asking for a little help:
My problem is that I need a separate function for log scales, (call it lx). It's a bit of a pain because every time I want to change the scale I need to modify the annotate commands (change sx to lx) and occasionally there are many. This could easily be solved in the sx function if there was a way to tell what the type of scale was. For instance, is there a parameter in ggplot_build objects that describe the log/lin nature of the scale? That seems to be the best place to find it (that's where I'm pulling the scale's range) but I've looked and can not figure it out. If there was, then I could add a command to step 3 above to define the scale type, and add a tag to the sx function in step 1. That would save me some tedious work.
So, just to reiterate: does anyone know how to tell the scaling (type of scale: log or linear) of a ggplot object? such as using the ggplot_build command's object?
Suppose we have a list of pre-build plots:
linear <- ggplot(iris, aes(Sepal.Width, Sepal.Length, colour = Species)) +
geom_point()
log <- linear + scale_y_log10()
linear <- ggplot_build(linear)
log <- ggplot_build(log)
plotlist <- list(a = linear, b = log)
We can grab information about their position scales in the following way:
out <- lapply(names(plotlist), function(i) {
# Grab plot, panel parameters and scales
plot <- plotlist[[i]]
params <- plot$layout$panel_params[[1]]
scales <- plot$plot$scales$scales
# Only keep (continuous) position scales
keep <- vapply(scales, function(x) {
inherits(x, "ScaleContinuousPosition")
}, logical(1))
scales <- scales[keep]
# Grab relevant transformations
out <- lapply(scales, function(scale) {
data.frame(position = scale$aesthetics[1],
# And now for the actual question:
transformation = scale$trans$name,
plot = i)
})
out <- do.call(rbind, out)
# Grab relevant ranges
ranges <- params[paste0(out$position, ".range")]
out$min <- sapply(ranges, `[`, 1)
out$max <- sapply(ranges, `[`, 2)
out
})
out <- do.call(rbind, out)
Which will give us:
out
position transformation plot min max
1 x identity a 1.8800000 4.520000
2 y identity a 4.1200000 8.080000
3 y log-10 b 0.6202605 0.910835
4 x identity b 1.8800000 4.520000
Or if you prefer a straightforward answer:
log$plot$scales$scales[[1]]$trans$name
[1] "log-10"
I try to create a list of plots of my data using a for loop to filter (="TAB_tmp2") and add the new plot in the list (="ListeGRAPH"). I think the problem comes from the difference of filter data table (="TAB_tmp2").
I have read several topics on the web about that but I can't find a solution which could works in this case.
My code :
rm(list=ls()) # delete objects
#====================================
# Create data for the example
#====================================
TAB = data.frame(Types_Mesures = c(rep(1,3),rep(2,5),rep(3,10)))
TAB$ID_mesuresParType=NA
TAB$Mesures=log(c(1:length(TAB$Types_Mesures)))
Nb_Types=length(unique(TAB$Types_Mesures)) # in the real data, the number of "Types_Mesures" can change
for (x in 1:Nb_Types) {
TAB_tmp=TAB[TAB$Types_Mesures==x,2]
TAB[TAB$Types_Mesures==x,2]=c(1:length(TAB_tmp))
}
#====================================
# List of plots
#====================================
library(gridExtra)
library(ggplot2)
INPUTDirectory= "D:/TEST/"
setwd(dir=INPUTDirectory)
ListeGRAPH <- list()
for (x in 1:Nb_Types) {
TAB_tmp2=TAB[TAB$Types_Mesures==x,]
ListeGRAPH[[x]] <- ggplot(data = TAB_tmp2) +
geom_line(aes(x = TAB_tmp2$ID_mesuresParType, y = TAB_tmp2$Mesures))
# #Save graph
# png(filename = paste("TAB_plot_T",x,".png", sep = ""))
# print(ListeGRAPH[[x]])
# graphics.off()
}
gridExtra::grid.arrange(grobs = ListeGRAPH)
When I run the code, I have this error :
Error: Aesthetics must be either length 1 or the same as the data (3):
x, y
It seems that grid.arrange don't accept plots of different dimensions ?
How could I do to make the list of plots with this kind of table ? In my real data the number of "Types_Mesures" can change.
More over, I think the for loop don't allow to use a temporary variable (="TAB_tmp2") to create the list of plot but this code works when I save my plot in PNG files.
Thanks a lot for you help !
The problem is actually not with grid.arrange. When you're creating the plots with ggplot, you do not need to use $ for indexing of columns. So instead of:
ListeGRAPH[[x]] <- ggplot(data = TAB_tmp2) +
geom_line(aes(x = TAB_tmp2$ID_mesuresParType, y = TAB_tmp2$Mesures))
you should use:
ListeGRAPH[[x]] <- ggplot(data = TAB_tmp2) +
geom_line(aes(x = ID_mesuresParType, y = Mesures))
and then you will be able to plot the results using grid.arrange.
I need to wrap ggplot2 into another function, and want to be able to parse variables in the same manner that they are accepted, can someone steer me in the correct direction.
Lets say for example, we consider the below MWE.
#Load Required libraries.
library(ggplot2)
##My Wrapper Function.
mywrapper <- function(data,xcol,ycol,colorVar){
writeLines("This is my wrapper")
plot <- ggplot(data=data,aes(x=xcol,y=ycol,color=colorVar)) + geom_point()
print(plot)
return(plot)
}
Dummy Data:
##Demo Data
myData <- data.frame(x=0,y=0,c="Color Series")
Existing Usage which executes without hassle:
##Example of Original Function Usage, which executes as expected
plot <- ggplot(data=myData,aes(x=x,y=y,color=c)) + geom_point()
print(plot)
Objective usage syntax:
##Example of Intended Usage, which Throws Error ----- "object 'xcol' not found"
mywrapper(data=myData,xcol=x,ycol=y,colorVar=c)
The above gives an example of the 'original' usage by the ggplot2 package, and, how I would like to wrap it up in another function. The wrapper however, throws an error.
I am sure this applies to many other applications, and it has probably been answered a thousand times, however, I am not sure what this subject is 'called' within R.
The problem here is that ggplot looks for a column named xcol in the data object. I would recommend to switch to using aes_string and passing the column names you want to map using a string, e.g.:
mywrapper(data = myData, xcol = "x", ycol = "y", colorVar = "c")
And modify your wrapper accordingly:
mywrapper <- function(data, xcol, ycol, colorVar) {
writeLines("This is my wrapper")
plot <- ggplot(data = data, aes_string(x = xcol, y = ycol, color = colorVar)) + geom_point()
print(plot)
return(plot)
}
Some remarks:
Personal preference, I use a lot of spaces around e.g. x = 1, for me this greatly improves the readability. Without spaces the code looks like a big block.
If you return the plot to outside the function, I would not print it inside the function, but just outside the function.
This is just an addition to the original answer, and I do know that this is quite an old post, but just as an addition:
The original answer provides the following code to execute the wrapper:
mywrapper(data = "myData", xcol = "x", ycol = "y", colorVar = "c")
Here, data is provided as a character string. To my knowledge this will not execute correctly. Only the variables within the aes_string are provided as character strings, while the data object is passed to the wrapper as an object.
Someone pointed out to me that there's a different way to specify the data and the aesthetics in ggplot2 as below. I've never seen this -- in all the books, docs, data is always a data frame and inside aes are the names of the variables. What's this dot syntax?
y <- rnorm(100) ; x <- rnorm(100)
m <- lm(y ~ x)
library(ggplot2)
ggplot(data = m, aes(.resid, .fitted)) + geom_point()
Upgrade comment
ggplot is calling fortify on the lm object, which produces a dataframe that is then passed to ggplot.data.frame.
To see the code use
ggplot2:::ggplot.default
#function (data = NULL, mapping = aes(), ..., environment = parent.frame())
#{
# ggplot.data.frame(fortify(data, ...), mapping, environment = environment)
#}
#<environment: namespace:ggplot2>
As for fortify it coerces various models and R objects to dataframes. Have a look at methods(fortify).
You can directly see the results of fortify
ff <- fortify(m)
names(ff)
#[1] "y" "x" ".hat" ".sigma" ".cooksd" ".fitted" ".resid" ".stdresid"
So the dot isn't doing anything clever within aes, but is actually part of the column names that fortify produces.
This is really one of two questions - either:
1) How do I store the result of a print() call [i.e. x <- print(something) ] without sending anything to current graphics output?
-or-
2) Is there a function or method in ggplot that will store a plot() call to a variable without calling plot() directly? ggplotGrob is in the ballpark, but a ggplotGrob object doesn't return a list with $data in it the same way you get when you store the result of print() to a variable.
I'm using a technique picked up from this SO answer to pull out the points of a geom_density curve, and then using that data to generate some annotations. I've outlined the issue below -- when I call this as a function, I get the undesired intermediate plot object in my pdf, along with the final plot. The goal is to get rid of that undesired plot; given that base hist() has a plot = FALSE option I was hopeful that someone who knows something more about R viewports would be able to fix my plot() call (solution #1), but any solution is fine, frankly.
library(ggplot2)
library(plyr)
demo <- function (df) {
p <- ggplot(
df
,aes(
x = rating
)
) +
geom_density()
#plot the object so we can access $data
render_plot <- plot(p + ggtitle("Don't want this plot"))
#grab just the DF for the density line
density_df <- render_plot$data[[1]]
#get the maximum density value
max_y <- ddply(density_df, "group", summarise, y = max(y))
#join that back to the data to find the matching row
anno <- join(density_df, max_y, type = 'inner')
#use this to annotate
p <- p + annotate(
geom = 'text'
,x = anno$x
,y = anno$y
,label = round(anno$density, 3)
) +
ggtitle('Keep this plot')
return(p)
}
#call to demo outputs an undesired plot to the graphics device
ex <- demo(movies[movies$Comedy ==1,])
plot(ex)
#this is problematic if you are trying to make a PDF
#a distinct name for the pdf to avoid filesystem issues
unq_name <- as.character(format(Sys.time(), "%X"))
unq_name <- gsub(':', '', unq_name)
pdf(paste(unq_name , '.pdf', sep=''))
p <- demo(movies[movies$Drama ==1,])
print(p)
dev.off()
Use ggplot_build:
render_plot <- ggplot_build(p + ggtitle("Don't want this plot"))