Cannot save plots as pdf when ggplot function is called inside a function - r

I am going to plot a boxplot from a 4-column matrix pl1 using ggplot with dots on each box. The instruction for plotting is like this:
p1 <- ggplot(pl1, aes(x=factor(Edge_n), y=get(make.names(y_label)), ymax=max(get(make.names(y_label)))*1.05))+
geom_boxplot(aes(fill=method), outlier.shape= NA)+
theme(text = element_text(size=20), aspect.ratio=1)+
xlab("Number of edges")+
ylab(y_label)+
scale_fill_manual(values=color_box)+
geom_point(aes(x=factor(Edge_n), y=get(make.names(true_des)), ymax=max(get(make.names(true_des)))*1.05, color=method),
position = position_dodge(width=0.75))+
scale_color_manual(values=color_pnt)
Then I use print(p1) to draw it to an open PDF device. However, this does not work, and I get the following error:
Error in make.names(true_des) : object 'true_des' not found
Can anyone help?

Your example is not very clear because you give a call but you don't show the values of your variables so it's really hard to figure out what you're trying to do (for instance, is method the name of a column in the data frame pl1, or is it a variable (and if it's a variable, what is its type? string? name?)).
Nonetheless, here's an example that should help set you on the way to doing what you want:
pl1 <- data.frame(Edge_n = sample(5, 20, TRUE), foo = rnorm(20), bar = rnorm(20))
y_label <- 'foo'
ax <- do.call(aes, list(
x=quote(factor(Edge_n)),
y=as.name(y_label),
ymax = substitute(max(y)*1.05, list(y=as.name(y_label)))))
p1 <- ggplot(pl1) + geom_boxplot(ax)
print(p1)
This should get you started on figuring out the rest of what you're trying to do.
Alternatively (under a different interpretation of your question), you may be running into a problem with the environment in which aes evaluates its arguments. See https://github.com/hadley/ggplot2/issues/743 for details. If this is the issue, then the answer might be to override the default value of the environment argument to aes, for instance: aes(x=factor(Edge_n), y=get(make.names(y_label)), ymax=max(get(make.names(y_label)))*1.05, environment=environment())
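As a further option, here is a minimal sketch of building the plot inside a function without get(). It assumes ggplot2 3.0.0 or later, where aes() captures its arguments together with their environment and the .data pronoun looks up columns by name. The function name plot_box, the sample data, and the output path are hypothetical; the ymax aesthetic and the manual colour scales from the question are left out for brevity.

```r
library(ggplot2)

# Hypothetical wrapper: column names arrive as strings and are looked up
# in the data through the .data pronoun, so get() is not needed.
plot_box <- function(df, y_col, pt_col) {
  ggplot(df, aes(x = factor(Edge_n), y = .data[[y_col]])) +
    geom_boxplot(aes(fill = method), outlier.shape = NA) +
    geom_point(aes(y = .data[[pt_col]], colour = method),
               position = position_dodge(width = 0.75)) +
    xlab("Number of edges") +
    ylab(y_col)
}

# Made-up data with the column names used in the question
set.seed(1)
pl1 <- data.frame(
  Edge_n = rep(c(5, 10), each = 10),
  method = rep(c("A", "B"), times = 10),
  foo    = rnorm(20),
  bar    = rnorm(20)
)

p1 <- plot_box(pl1, y_col = "foo", pt_col = "bar")

# Write the plot to a PDF; print() must be called explicitly whenever the
# plot is drawn from inside a function, a loop, or a sourced script.
out_file <- file.path(tempdir(), "boxplot.pdf")
pdf(out_file)
print(p1)
dev.off()
```

Because the column names are plain strings here, the same function can be reused for other columns without touching the aes() calls.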

Related

Error in axis(side = side, at = at, labels = labels, ...) : invalid value specified for graphical parameter "pch"

I have applied the DBSCAN algorithm to the built-in iris dataset in R, but I get an error when I try to visualise the output using plot().
Following is my code.
library(fpc)
library(dbscan)
data("iris")
head(iris,2)
data1 <- iris[,1:4]
head(data1,2)
set.seed(220)
db <- dbscan(data1,eps = 0.45,minPts = 5)
table(db$cluster,iris$Species)
plot(db,data1,main = 'DBSCAN')
Error: Error in axis(side = side, at = at, labels = labels, ...) :
invalid value specified for graphical parameter "pch"
How to rectify this error?
I have a suggestion below, but first I see two issues:
You're loading two packages, fpc and dbscan, both of which have different functions named dbscan(). This could create tricky bugs later (e.g. if you change the order in which you load the packages, different functions will be run).
It's not clear what you're trying to plot, either what the x- or y-axes should be or the type of plot. The function plot() generally takes a vector of values for the x-axis and another for the y-axis (although not always, consult ?plot), but here you're passing it a data.frame and a dbscan object, and it doesn't know how to handle it.
Here's one way of approaching it, using ggplot() to make a scatterplot, and dplyr for some convenience functions:
# load our packages
# note: only loading dbscan, not loading fpc since we're not using it
library(dbscan)
library(ggplot2)
library(dplyr)
# run dbscan::dbscan() on the first four columns of iris
db <- dbscan::dbscan(iris[,1:4],eps = 0.45,minPts = 5)
# create a new data frame by binding the derived clusters to the original data
# this keeps our input and output in the same dataframe for ease of reference
data2 <- bind_cols(iris, cluster = factor(db$cluster))
# make a table to confirm it gives the same results as the original code
table(data2$cluster, data2$Species)
# using ggplot, make a point plot with "jitter" so each point is visible
# x-axis is species, y-axis is cluster, also coloured according to cluster
ggplot(data2) +
geom_point(mapping = aes(x=Species, y = cluster, colour = cluster),
position = "jitter") +
labs(title = "DBSCAN")
The plot it generates (image not reproduced here) shows the jittered points for each species, coloured by cluster.
If you're looking for something else, please be more specific about what the final plot should look like.
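If the goal is instead to see the clusters in the original measurement space, one possible sketch is a base R scatterplot matrix coloured by cluster. It assumes only the dbscan package is installed, and it calls the function with an explicit dbscan:: prefix so that the fpc function of the same name cannot be picked up by accident. The colour offset and the plotting symbol are illustrative choices.

```r
# Call the dbscan package's function explicitly to avoid the clash with fpc
db <- dbscan::dbscan(iris[, 1:4], eps = 0.45, minPts = 5)

# Cluster 0 marks noise points; adding 1 maps noise to colour 1 of the
# palette and clusters 1, 2, ... to colours 2, 3, ...
cluster_colours <- db$cluster + 1L

# Scatterplot matrix of the four measurements, coloured by cluster
pairs(iris[, 1:4], col = cluster_colours, pch = 19, main = "DBSCAN")
```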

Why aren't any points showing up in the qqcomp function when using plotstyle="ggplot"?

I want to compare the fit of different distributions to my data in a single plot. The qqcomp function from the fitdistrplus package pretty much does exactly what I want to do. The only problem I have however, is that it's mostly written using base R plot and all my other plots are written in ggplot2. I basically just want to customize the qqcomp plots to look like they have been made in ggplot2.
From the documentation (https://www.rdocumentation.org/packages/fitdistrplus/versions/1.0-14/topics/graphcomp) I get that this is totally possible by setting plotstyle="ggplot". If I do this however, no points are showing up on the plot, even though it worked perfectly without the plotstyle argument. Here is a little example to visualize my problem:
library(fitdistrplus)
library(ggplot2)
set.seed(42)
vec <- rgamma(100, shape=2)
fit.norm <- fitdist(vec, "norm")
fit.gamma <- fitdist(vec, "gamma")
fit.weibull <- fitdist(vec, "weibull")
model.list <- list(fit.norm, fit.gamma, fit.weibull)
qqcomp(model.list)
This gives the expected plot, with the points of all three fits visible (image not reproduced here).
While this:
qqcomp(model.list, plotstyle="ggplot")
gives a plot in which no points appear (image not reproduced here).
Why are the points not showing up? Am I doing something wrong here or is this a bug?
EDIT:
So I haven't figured out why this doesn't work, but there is a pretty easy workaround. The call qqcomp(model.list, plotstyle="ggplot") still returns a ggplot object, which includes the data used to make the plot. Using that data, one can easily write a custom plotting function that does exactly what is needed. It's not very elegant, but until someone finds out why it's not working as expected I will just use this method.
I was able to reproduce your error, and it is indeed intriguing. You may want to contact the developers of this package to report this bug.
Otherwise, you can reproduce this Q-Q plot with ggplot and stat_qq by passing the corresponding quantile function and its fitted parameters (stored in $estimate):
library(ggplot2)
df = data.frame(vec)
ggplot(df, aes(sample = vec))+
stat_qq(distribution = qgamma, dparams = as.list(fit.gamma$estimate), color = "green")+
stat_qq(distribution = qnorm, dparams = as.list(fit.norm$estimate), color = "red")+
stat_qq(distribution = qweibull, dparams = as.list(fit.weibull$estimate), color = "blue")+
geom_abline(slope = 1, color = "black")+
labs(title = "Q-Q Plots", x = "Theoretical quantiles", y = "Empirical quantiles")
Hope it will help you.
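A variation on the same idea, sketched under some assumptions: compute the theoretical quantiles by hand and put all three fits into one data frame, so that ggplot builds the colour legend itself. The plotting positions come from ppoints(), which is an assumption and may differ from what qqcomp uses internally; the names fits, qq_df and qq_plot are made up for this example.

```r
library(fitdistrplus)
library(ggplot2)

set.seed(42)
vec <- rgamma(100, shape = 2)

# Fit the three candidate distributions, as in the question
fits <- list(
  norm    = fitdist(vec, "norm"),
  gamma   = fitdist(vec, "gamma"),
  weibull = fitdist(vec, "weibull")
)

# Plotting positions (an assumption, see the note above)
p <- ppoints(length(vec))

# One row per observation and distribution: theoretical quantile from the
# fitted parameters against the sorted sample
qq_df <- do.call(rbind, lapply(names(fits), function(nm) {
  qfun <- get(paste0("q", nm), mode = "function")  # qnorm, qgamma, qweibull
  data.frame(
    distribution = nm,
    theoretical  = do.call(qfun, c(list(p), as.list(fits[[nm]]$estimate))),
    empirical    = sort(vec)
  )
}))

qq_plot <- ggplot(qq_df, aes(theoretical, empirical, colour = distribution)) +
  geom_point() +
  geom_abline(slope = 1, intercept = 0) +
  labs(title = "Q-Q Plots", x = "Theoretical quantiles", y = "Empirical quantiles")
print(qq_plot)
```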

Basic Calculations with stat_functions -- Plotting hazard functions

I am currently trying to plot some density functions with R's ggplot2. I have the following code:
f <- stat_function(fun="dweibull",
args=list("shape"=1),
"x" = c(0,10))
stat_F <- stat_function(fun="pweibull",
args=list("shape"=1),
"x" = c(0,10))
S <- function() 1 - stat_F
h <- function() f / S
wei_h <- ggplot(data.frame(x=c(0,10))) +
stat_function(fun=h) +
...
Basically I want to plot hazard functions based on a Weibull distribution with varying parameters, meaning I want to plot h(x) = f(x) / S(x), where S(x) = 1 - F(x).
The above code gives me this error:
Computation failed in stat_function():
unused argument (x_trans)
I also tried to directly use
S <- 1 - stat_function(fun="pweibull", ...)
instead of the above "workaround" with the custom function construction. This threw another error, since I was trying to do numeric arithmetic on an object:
non-numeric argument for binary operator
I get that error, but I have no idea for a solution.
I have done some research, but without success. I feel like this should be straightforward. Also, I would like to do it "manually" as much as possible, but if there is no simple way to do this, then a packaged solution is just fine as well.
Thanks in advance for any suggestions!
PS: I basically want to recreate the graph you can find in Kiefer, 1988 on page 10 of the linked PDF file.
Three comments:
stat_function is a function statistic for ggplot2, you cannot divide two stat_function expressions by each other or otherwise use them in mathematical expressions, as in S <- 1 - stat_function(fun="pweibull", ...). That's a fundamental misunderstanding of what stat_function is. stat_function always needs to be added to a ggplot2 plot, as in the example below.
The fun argument for stat_function takes a function as an argument, not a string. You can define functions on the fly if you need ones that don't exist already.
You need to set up an aesthetic mapping, via the aes function.
This code works:
args = list("shape" = 1.2)
ggplot(data.frame(x = seq(0, 10, length.out = 100)), aes(x)) +
stat_function(fun = dweibull, args = args, color = "red") +
stat_function(fun = function(...){1-pweibull(...)}, args = args, color = "green") +
stat_function(fun = function(...){dweibull(...)/(1-pweibull(...))},
args = args, color = "blue")
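For hazard curves with varying parameters, another possible sketch is to compute the hazard on a grid and draw one line per shape value. The helper name weibull_hazard, the shape values, and the grid range are illustrative assumptions, not taken from the question or from Kiefer's paper.

```r
library(ggplot2)

# Hazard h(x) = f(x) / S(x), with S(x) = 1 - F(x) taken as the upper tail
weibull_hazard <- function(x, shape, scale = 1) {
  dweibull(x, shape, scale) / pweibull(x, shape, scale, lower.tail = FALSE)
}

# Illustrative shape values; the grid starts above 0 because the hazard
# is unbounded at 0 when shape < 1
shapes <- c(0.5, 1, 1.5, 2)
grid <- expand.grid(x = seq(0.05, 5, length.out = 200), shape = shapes)
grid$hazard <- weibull_hazard(grid$x, grid$shape)

hazard_plot <- ggplot(grid, aes(x, hazard, colour = factor(shape))) +
  geom_line() +
  labs(colour = "shape", y = "hazard")
print(hazard_plot)
```

With shape = 1 the Weibull reduces to the exponential distribution, so that curve is a flat line at 1 (for scale = 1), which makes a quick sanity check.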

Pass function argument to ggplot label [duplicate]

I need to wrap ggplot2 in another function and want to be able to pass variables in the same manner that ggplot accepts them. Can someone steer me in the right direction?
Let's say, for example, we consider the MWE below.
#Load Required libraries.
library(ggplot2)
##My Wrapper Function.
mywrapper <- function(data,xcol,ycol,colorVar){
writeLines("This is my wrapper")
plot <- ggplot(data=data,aes(x=xcol,y=ycol,color=colorVar)) + geom_point()
print(plot)
return(plot)
}
Dummy Data:
##Demo Data
myData <- data.frame(x=0,y=0,c="Color Series")
Existing Usage which executes without hassle:
##Example of Original Function Usage, which executes as expected
plot <- ggplot(data=myData,aes(x=x,y=y,color=c)) + geom_point()
print(plot)
Objective usage syntax:
##Example of Intended Usage, which Throws Error ----- "object 'xcol' not found"
mywrapper(data=myData,xcol=x,ycol=y,colorVar=c)
The above gives an example of the 'original' usage by the ggplot2 package, and, how I would like to wrap it up in another function. The wrapper however, throws an error.
I am sure this applies to many other applications, and it has probably been answered a thousand times, however, I am not sure what this subject is 'called' within R.
The problem here is that ggplot looks for a column named xcol in the data object. I would recommend switching to aes_string and passing the names of the columns you want to map as strings, e.g.:
mywrapper(data = myData, xcol = "x", ycol = "y", colorVar = "c")
And modify your wrapper accordingly:
mywrapper <- function(data, xcol, ycol, colorVar) {
writeLines("This is my wrapper")
plot <- ggplot(data = data, aes_string(x = xcol, y = ycol, color = colorVar)) + geom_point()
print(plot)
return(plot)
}
Some remarks:
Personal preference, I use a lot of spaces around e.g. x = 1, for me this greatly improves the readability. Without spaces the code looks like a big block.
If you return the plot to outside the function, I would not print it inside the function, but just outside the function.
I know this is quite an old post, but as an addition to the original answer:
The original answer provides the following code to execute the wrapper:
mywrapper(data = "myData", xcol = "x", ycol = "y", colorVar = "c")
Here, data is provided as a character string. To my knowledge this will not execute correctly. Only the variables within the aes_string are provided as character strings, while the data object is passed to the wrapper as an object.
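Note that aes_string is soft-deprecated in ggplot2 3.0.0 and later. A minimal sketch of two alternatives, assuming a recent ggplot2 and rlang: the embrace operator {{ }} forwards unquoted column names (the usage the question asked for), and the .data pronoun accepts strings. The function name mywrapper_chr is made up for this example.

```r
library(ggplot2)

# Unquoted column names: {{ }} forwards each argument into aes()
mywrapper <- function(data, xcol, ycol, colorVar) {
  ggplot(data, aes(x = {{ xcol }}, y = {{ ycol }}, color = {{ colorVar }})) +
    geom_point()
}

# String column names: look each one up with the .data pronoun
mywrapper_chr <- function(data, xcol, ycol, colorVar) {
  ggplot(data, aes(x = .data[[xcol]], y = .data[[ycol]],
                   color = .data[[colorVar]])) +
    geom_point()
}

myData <- data.frame(x = 0, y = 0, c = "Color Series")

p_bare <- mywrapper(myData, xcol = x, ycol = y, colorVar = c)
p_chr  <- mywrapper_chr(myData, xcol = "x", ycol = "y", colorVar = "c")

# Printing happens outside the wrappers, as suggested in the remarks above
print(p_bare)
print(p_chr)
```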

Ggplot does not show plots in sourced function

I've been trying to draw two plots using R's ggplot library in RStudio. Problem is, when I draw two within one function, only the last one displays (in RStudio's "plots" view) and the first one disappears. Even worse, when I run ggsave() after each plot - which saves them to a file - neither of them appear (but the files save as expected). However, I want to view what I've saved in the plots as I was able to before.
Is there a way I can both display what I'll be plotting in RStudio's plots view and also save them? Moreover, when the plots are not being saved, why does the display problem happen when there's more than one plot? (i.e. why does it show the last one but not the ones before?)
The code with the plotting parts are below. I've removed some parts because they seem unnecessary (but can add them if they are indeed relevant).
HHIplot = ggplot(pergame)
# some ggplot geoms and misc. here
ggsave(paste("HHI Index of all games,",year,"Finals.png"),
path = plotpath, width = 6, height = 4)
HHIAvePlot = ggplot(AveHHI, aes(x = AveHHI$n_brokers))
# some ggplot geoms and misc. here
ggsave(paste("Average HHI Index of all games,",year,"Finals.png"),
path = plotpath, width = 6, height = 4)
I've already taken a look here and here but neither have helped. Adding a print(HHIplot) or print(HHIAvePlot) after the ggsave() lines has not displayed the plot.
Many thanks in advance.
Update 1: The solution suggested below didn't work, although it works for the answer's sample code. I passed the ggplot objects to .GlobalEnv and print() gives me an empty gray box on the plot area (which I imagine is an empty ggplot object with no layers). I think the issue might lie in some of the layers or manipulators I have used, so I've brought the full code for one ggplot object below. Any thoughts? (Note: I've tried putting the assign() line in all possible locations in relation to ggsave() and ggplot().)
HHIplot = ggplot(pergame)
HHIplot +
geom_point(aes(x = pergame$n_brokers, y = pergame$HHI)) +
scale_y_continuous(limits = c(0,10000)) +
scale_x_discrete(breaks = gameSizes) +
labs(title = paste("HHI Index of all games,",year,"Finals"),
x = "Game Size", y = "Herfindahl-Hirschman Index") +
theme(text = element_text(size=15),axis.text.x = element_text(angle = 0, hjust = 1))
assign("HHIplot",HHIplot, envir = .GlobalEnv)
ggsave(paste("HHI Index of all games,",year,"Finals.png"),
path = plotpath, width = 6, height = 4)
I'll preface this by saying that the following is bad practice. It's considered bad practice to break a programming language's scoping rules for something as trivial as this, but here's how it's done anyway.
So within the body of your function you'll create both plots and put them into variables. Then you'll use ggsave() to write them out. Finally, you'll use assign() to push the variables to the global scope.
library(ggplot2)
myFun <- function() {
#some sample data that you should be passing into the function via arguments
df <- data.frame(x=1:10, y1=1:10, y2=10:1)
p1 <- ggplot(df, aes(x=x, y=y1))+geom_point()
p2 <- ggplot(df, aes(x=x, y=y2))+geom_point()
ggsave('p1.jpg', p1)
ggsave('p2.jpg', p2)
assign('p1', p1, envir=.GlobalEnv)
assign('p2', p2, envir=.GlobalEnv)
return()
}
Now, when you run myFun() it will write out your two plots to .jpg files, and also drop the plots into your global environment so that you can just run p1 or p2 on the console and they'll appear in RStudio's Plot pane.
ONCE AGAIN, THIS IS BAD PRACTICE
Good practice would be to not worry about the fact that they're not popping up in RStudio. They wrote out to files, and you know they did, so go look at them there.
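An alternative that avoids assign() is to have the function return its plots and print them explicitly, since ggplot objects are only auto-printed at the top level of the console. The sketch below is written under assumptions: the function name make_plots, the sample data, and the output directory are made up, and PDF is used as the file type only because that device ships with base R. Two observations about the code in Update 1, offered tentatively: the result of HHIplot + geom_point(...) is never assigned back to HHIplot, which would be consistent with the empty gray box, and ggsave() is called without a plot argument, so it saves the last plot created or modified.

```r
library(ggplot2)

make_plots <- function(df, out_dir = tempdir()) {
  # Store each complete plot, layers included, in its own variable
  p1 <- ggplot(df, aes(x = x, y = y1)) + geom_point()
  p2 <- ggplot(df, aes(x = x, y = y2)) + geom_point()

  # Pass the plot explicitly instead of relying on the last plot
  ggsave(file.path(out_dir, "p1.pdf"), plot = p1, width = 6, height = 4)
  ggsave(file.path(out_dir, "p2.pdf"), plot = p2, width = 6, height = 4)

  # Inside a function, plots are only drawn when print() is called
  print(p1)
  print(p2)

  # Hand the plots back so the caller can redraw or modify them
  invisible(list(p1 = p1, p2 = p2))
}

plots <- make_plots(data.frame(x = 1:10, y1 = 1:10, y2 = 10:1))
```

Afterwards, plots$p1 or plots$p2 can be typed at the console to redraw either plot in RStudio's Plots pane.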

Resources