How to wrap ggplot2 as a function in R

I want to wrap the following code in a function:
ez <- function(x,a) {
z<-x^3+1
return(z)
}
Q1 <- c(1,2,3,4,5)
ggplot(tibble(x = c(-10, 10)), aes(x)) +
map(1:length(Q1),
~stat_function(fun = ez, aes(color = paste0("sand ", .)), args=list(a = Q1[.])))
This code draws multiple curves. They overlap, but that does not matter here.
I want to write a function like this:
plot <- function(a) {
Q1 <- c(1,2,3,4,5)
ggplot(tibble(x = c(-6, 6)), aes(x)) +
map(1:length(Q1),
~stat_function(fun = ez, aes(color = paste0("sand ", .)), args=list(a = Q1[.])))
}
plot(2, 4, 3,6)

Maybe this is what you are looking for. As far as I understand, you want a function that takes a vector of parameters and returns a plot with one curve per parameter. To this end:
Pass the parameter values as a vector, i.e. wrap them in c(...).
In your plot function, simply loop over a.
Note: I adjusted the function ez so that it returns different values (and therefore non-overlapping curves) depending on a.
ez <- function(x,a) {
z<-x^3+a^3
return(z)
}
library(ggplot2)
library(tibble)
library(purrr)
plot <- function(a) {
ggplot(tibble(x = c(-6, 6)), aes(x)) +
map(a,
~stat_function(fun = ez, aes(color = paste0("sand ", .)), args=list(a = .)))
}
plot(c(2, 4, 3, 6))
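The same idea can be sketched without purrr, since adding a plain list of layers to a ggplot object adds each layer in turn. The names plot_curves, xlim and a_i below are illustrative and not part of the original answer:

```r
library(ggplot2)

# Same curve family as in the answer: the offset depends on a
ez <- function(x, a) x^3 + a^3

# Hypothetical wrapper: one stat_function layer per value in `a`
plot_curves <- function(a, xlim = c(-6, 6)) {
  layers <- lapply(a, function(a_i) {
    stat_function(
      mapping = aes(color = paste0("sand ", a_i)),
      fun = ez,
      args = list(a = a_i)
    )
  })
  # `+` accepts a list of layers and adds them one by one
  ggplot(data.frame(x = xlim), aes(x)) + layers + labs(color = "a")
}

p <- plot_curves(c(2, 4, 3, 6))
```

Printing p draws the four curves. Using a name such as plot_curves rather than plot also avoids masking base R's plot() in the session.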

Related

Use a gradient fill under a facet wrap of density curves in ggplot in R?

Similar questions have been asked before in other forms. Some can be found here and here. However, I can't seem to adapt them to a facet wrap that displays multiple density plots.
I tried adapting the other examples, but failed... I also tried using the ggpattern package, but when there is a large amount of data, it takes several minutes on my machine to create a plot.
I am trying to create a gradient under the density curve... but with the gradient pointing down. Something like in the example image below:
Some example data to work with:
library(ggplot2)
set.seed(321)
# create data
varNames <- c("x1", "x2", "x3")
df <- data.frame(
var = sample(varNames, 100, replace = T),
val = runif(100)
)
# create plot
ggplot(df, aes(x = val)) +
geom_density(aes(colour = var, fill = var)) +
facet_wrap(~var) +
theme_bw() +
theme(legend.position = "none")
You can use teunbrand's function, but you will need to apply it to each facet separately. Here I simply loop over the groups with lapply:
library(tidyverse)
library(polyclip)
#> polyclip 1.10-0 built from Clipper C++ version 6.4.0
## This is teunbrands function copied without any change!!
## from https://stackoverflow.com/a/64695516/7941188
fade_polygon <- function(x, y, n = 100) {
poly <- data.frame(x = x, y = y)
# Create bounding-box edges
yseq <- seq(min(poly$y), max(poly$y), length.out = n)
xlim <- range(poly$x) + c(-1, 1)
# Pair y-edges
grad <- cbind(head(yseq, -1), tail(yseq, -1))
# Add vertical ID
grad <- cbind(grad, seq_len(nrow(grad)))
# Slice up the polygon
grad <- apply(grad, 1, function(range) {
# Create bounding box
bbox <- data.frame(x = c(xlim, rev(xlim)),
y = c(range[1], range[1:2], range[2]))
# Do actual slicing
slice <- polyclip::polyclip(poly, bbox)
# Format as data.frame
for (i in seq_along(slice)) {
slice[[i]] <- data.frame(
x = slice[[i]]$x,
y = slice[[i]]$y,
value = range[3],
id = c(1, rep(0, length(slice[[i]]$x) - 1))
)
}
slice <- do.call(rbind, slice)
})
# Combine slices
grad <- do.call(rbind, grad)
# Create IDs
grad$id <- cumsum(grad$id)
return(grad)
}
## now here starts the change, loop over your variables. I'm creating the data frame directly instead of keeping the density object
dens <- lapply(split(df, df$var), function(x) {
dens <- density(x$val)
data.frame(x = dens$x, y = dens$y)
}
)
## we need this one for the plot, but still need the list
dens_df <- bind_rows(dens, .id = "var")
grad <- bind_rows(lapply(dens, function(x) fade_polygon(x$x, x$y)), .id = "var")
ggplot(grad, aes(x, y)) +
geom_line(data = dens_df) +
geom_polygon(aes(alpha = value, group = id),
fill = "blue") +
facet_wrap(~var) +
scale_alpha_continuous(range = c(0, 1))
Created on 2021-12-05 by the reprex package (v2.0.1)
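To see why the fill fades towards the bottom, it may help to look at the banding step of fade_polygon in isolation. The following is a minimal base R sketch with a made-up height range of 0 to 1 and only five edges; the column names lower, upper and value are added here for readability:

```r
# Five horizontal edges give four bands between y = 0 and y = 1
yseq <- seq(0, 1, length.out = 5)

# Pair each edge with the next one and number the bands from the bottom up
bands <- cbind(
  lower = head(yseq, -1),
  upper = tail(yseq, -1),
  value = seq_len(length(yseq) - 1)
)
bands
```

Each row describes one horizontal slice that is clipped against the density polygon. Because value increases with height and is mapped to alpha in the plot, the lowest slice is the most transparent and the highest the most opaque.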

gam plots with ggplot

I need to create some gam plots in ggplot. I can do them with the general plot function, but am unsure how to do with ggplot. Here is my code and plots with the regular plot function. I'm using the College data set from the ISLR package.
train.2 <- sample(dim(College)[1],2*dim(College)[1]/3)
train.college <- College[train.2,]
test.college <- College[-train.2,]
gam.college <- gam(Outstate~Private+s(Room.Board)+s(Personal)+s(PhD)+s(perc.alumni)+s(Expend)+s(Grad.Rate), data=train.college)
par(mfrow=c(2,2))
plot(gam.college, se=TRUE,col="blue")
See update below old answer.
Old answer:
There is an implementation of GAM plotting with ggplot2 in the voxel package. Here is how you would go about it:
library(ISLR)
library(mgcv)
library(voxel)
library(tidyverse)
library(gridExtra)
data(College)
set.seed(1)
train.2 <- sample(dim(College)[1],2*dim(College)[1]/3)
train.college <- College[train.2,]
test.college <- College[-train.2,]
gam.college <- gam(Outstate~Private+s(Room.Board)+s(Personal)+s(PhD)+s(perc.alumni)+s(Expend)+s(Grad.Rate), data=train.college)
vars <- c("Room.Board", "Personal", "PhD", "perc.alumni","Expend", "Grad.Rate")
map(vars, function(x){
p <- plotGAM(gam.college, smooth.cov = x) #plot customization goes here
g <- ggplotGrob(p)
}) %>%
{grid.arrange(grobs = (.), ncol = 2, nrow = 3)}
This produces the plots, along with several warnings of the form:
In plotGAM(gam.college, smooth.cov = x) :
There are one or more factors in the model fit, please consider plotting by group since plot might be unprecise
For comparison, the plot.gam output:
par(mfrow=c(2,3))
plot(gam.college, se=TRUE,col="blue")
You might also want to plot the observed values:
map(vars, function(x){
p <- plotGAM(gam.college, smooth.cov = x) +
geom_point(data = train.college, aes_string(y = "Outstate", x = x ), alpha = 0.2) +
geom_rug(data = train.college, aes_string(y = "Outstate", x = x ), alpha = 0.2)
g <- ggplotGrob(p)
}) %>%
{grid.arrange(grobs = (.), ncol = 3, nrow = 2)}
Or per group, which is especially important if you used the by argument (an interaction in gam):
map(vars, function(x){
p <- plotGAM(gam.college, smooth.cov = x, groupCovs = "Private") +
geom_point(data = train.college, aes_string(y = "Outstate", x = x, color= "Private"), alpha = 0.2) +
geom_rug(data = train.college, aes_string(y = "Outstate", x = x, color= "Private" ), alpha = 0.2) +
scale_color_manual("Private", values = c("#868686FF", "#0073C2FF")) +
theme(legend.position="none")
g <- ggplotGrob(p)
}) %>%
{grid.arrange(grobs = (.), ncol = 3, nrow = 2)}
Update, 08. Jan. 2020.
I currently think the mgcViz package offers superior functionality compared to the voxel::plotGAM function. An example using the above data set and model:
library(mgcViz)
viz <- getViz(gam.college)
print(plot(viz, allTerms = T), pages = 1)
Plot customization is similar to ggplot2 syntax:
trt <- plot(viz, allTerms = T) +
l_points() +
l_fitLine(linetype = 1) +
l_ciLine(linetype = 3) +
l_ciBar() +
l_rug() +
theme_grey()
print(trt, pages = 1)
This vignette shows many more examples.

R: Loess regression produces a staircase-like graph, rather than being smoothed, after the value 10

What are possible reasons as to why this is happening? It always happens after the value 10.
A subset of the dataset around the area of interest before and after the regression was applied:
Before
After
Dataset to reproduce graph
This is the ggplot2 call that I am using to generate the graph. The smoothing span used is 0.05.
dat <- read.csv("before_loess.csv", stringsAsFactors = FALSE)
smoothed.data <- applyLoessSmooth(dat, 0.05) # dat is the dataset before being smoothed
scan.plot.data <- melt(smoothed.data, id.vars = "sample.diameters", variable.name = 'series')
scan.plot <- ggplot(data = scan.plot.data, aes(sample.diameters, value)) +
geom_line(aes(colour = series)) +
xlab("Diameters (nm)") +
ylab("Concentration (dN#/cm^2)") +
theme(plot.title = element_text(hjust = 0.5))
Function used to apply the loess filter:
applyLoessSmooth <- function(raw.data, smoothing.span) {
raw.data <- raw.data[complete.cases(raw.data),]
## response
vars <- colnames(raw.data)
## covariate
id <- 1:nrow(raw.data)
## define a loess filter function (fitting loess regression line)
loess.filter <- function (x, given.data, span) loess(formula = as.formula(paste(x, "id", sep = "~")),
data = given.data,
degree = 1,
span = span)$fitted
## apply filter column-by-column
loess.graph.data <- as.data.frame(lapply(vars, loess.filter, given.data = raw.data, span = smoothing.span),
col.names = colnames(raw.data))
sample.rows <- length(loess.graph.data[1])
loess.graph.data <- loess.graph.data %>% mutate("sample.diameters" = raw.data$sample.diameters[1:nrow(raw.data)])
}
The first problem is simply that your data is rounded to three significant figures. Below 10, the values on your x axis scan.plot.data$sample.diameters increase in 0.01 increments, which produces a smooth curve on the chart, but after 10 they increase in 0.1 increments, which shows up as visible steps on the chart.
The second problem is that you should be regressing against the values of sample.diameters, rather than against the row numbers id. I think this is causing there to be multiple smoothed values for each distinct value of x - hence the steps. Here are a couple of suggested small modifications to your function...
applyLoessSmooth <- function(raw.data, smoothing.span) {
raw.data <- raw.data[complete.cases(raw.data),]
vars <- colnames(raw.data)
vars <- vars[vars != "sample.diameters"] #you are regressing against this, so exclude it from vars
loess.filter <- function (x, given.data, span) loess(
formula = as.formula(paste(x, "sample.diameters", sep = "~")), #not 'id'
data = given.data,
degree = 1,
span = span)$fitted
loess.graph.data <- as.data.frame(lapply(vars, loess.filter, given.data = raw.data,
span = smoothing.span),
col.names = vars) #final argument edited
loess.graph.data$sample.diameters <- raw.data$sample.diameters #simplified
return(loess.graph.data)
}
All of which seems to do the trick...
Of course, you could have just done this...
dat.melt <- melt(dat, id.vars = "sample.diameters", variable.name = 'series')
ggplot(data = dat.melt, aes(sample.diameters, value, colour=series)) +
geom_smooth(method="loess", span=0.05, se=FALSE)
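The rounding effect described in the first point can be reproduced with a few made-up numbers. This sketch assumes only that the diameters were rounded to three significant figures, as stated above; the grid x is invented for illustration:

```r
# A fine grid that crosses 10, then rounded to three significant figures
x <- seq(9.9, 10.3, by = 0.01)
x3 <- signif(x, 3)

below <- unique(x3[x3 < 10])   # 9.90, 9.91, ..., 9.99
above <- unique(x3[x3 >= 10])  # 10.0, 10.1, 10.2, 10.3

diff(below)  # spacing of about 0.01
diff(above)  # spacing of about 0.1
```

Above 10, ten consecutive grid points collapse onto the same rounded x value, which is what shows up as a step in the line plot.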

Using ... function argument as input to another function

I would like to use ... to pass arguments into ggplot in a different function. For example:
dat <- data.frame(x = c(1, 2, 3), y = c(1, 2, 3))
f <- function(dat) {
ylimits = c(min(dat$x, dat$y), max(dat$x, dat$y))
g(dat, ylim = ylimits)
}
g <- function(dat, ...) {
args <- eval(substitute(alist(...)))
ggplot(dat, aes(x = x, y = y)) + geom_point() + coord_cartesian(ylim = args[['ylim']])
}
f(dat)
I tried using eval(args[['ylim']]), various combinations of quote/deparse/substitute but I haven't been able to get it to evaluate properly.
The environment of the calling function isn't passed along with the captured expression, so if you save the call and then try to evaluate the expression inside g, it won't be able to find ylimits, which exists only in f's environment.
One option is to use the lazyeval package, but it is currently being deprecated in favor of rlang, whose dots_list will do the trick nicely for you:
library(ggplot2)
dat <- data.frame(x = c(1, 2, 3), y = c(1, 2, 3))
f <- function(dat) {
ylimits = c(min(dat$x, dat$y), max(dat$x, dat$y))
g(dat, ylim = ylimits)
}
g <- function(dat, ...) {
args <- rlang::dots_list(...)
ggplot(dat, aes(x = x, y = y)) + geom_point() + coord_cartesian(ylim = eval(args[['ylim']]))
}
f(dat)
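The difference between capturing and evaluating the dots can be seen without ggplot2. The sketch below uses hypothetical helpers g_unevaluated and g_evaluated, and uses base R's list(...) as a stand-in for rlang::dots_list(...), on the assumption that only the evaluated values are needed:

```r
# Captures the expressions in ... without evaluating them
g_unevaluated <- function(...) eval(substitute(alist(...)))[["ylim"]]

# Evaluates ... in the caller's environment and returns the values
g_evaluated <- function(...) list(...)[["ylim"]]

f <- function() {
  ylimits <- c(1, 3)
  list(
    captured = g_unevaluated(ylim = ylimits),
    forced   = g_evaluated(ylim = ylimits)
  )
}

res <- f()
res$captured  # the bare symbol `ylimits`, meaningless outside f
res$forced    # the numeric vector c(1, 3)
```

The captured version holds only the name ylimits, so evaluating it inside g fails because that variable lives in f. The forced version already contains the values, so no further eval() is required.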

ggplot2 - Error bars using a custom function

I wish to generate error bars for each data point in ggplot2 using a generic function that looks up the column names with the names function. The following is a demo:
plotfn <- function(data, xind, yind, yerr) {
yerrbar <- aes_string(ymin=names(data)[yind]-names(data)[yerr], ymax=names(data)[yind]+names(data)[yerr])
p <- ggplot(data, aes_string(x=names(data)[xind], y=names(data)[yind])) + geom_point() + geom_errorbar(yerrbar)
p
}
errdf <- data.frame('X'=rnorm(100, 2, 3), 'Y'=rnorm(100, 5, 6), 'eY'=rnorm(100))
plotfn(errdf, 1, 2, 3)
Running this gives the following error:
Error in names(data)[yind] - names(data)[yerr] :
non-numeric argument to binary operator
Any suggestions? Thanks.
You need to pass a single character string that contains the operator ('a-b', not 'a'-'b'), e.g.:
ggplot(mtcars,aes_string(y = 'mpg-disp',x = 'am')) + geom_point()
In your example
plotfn <- function(data, xind, yind, yerr) {
# subset the names now so it is slightly less typing later
yerr_names <- names(data)[c(yind,yerr)]
yerrbar <- aes_string(ymin = paste(yerr_names, collapse = '-'),
ymax = paste(yerr_names,collapse='+'))
p <- ggplot(data, aes_string(x=names(data)[xind], y=names(data)[yind])) +
geom_point() +
geom_errorbar(mapping = yerrbar)
p
}
# a slightly smaller, reproducible example
set.seed(1)
errdf <- data.frame('X'=rnorm(10, 2, 3), 'Y'=rnorm(10, 5, 6), 'eY'=rnorm(10))
plotfn(errdf, 1, 2, 3)
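Why the pasted string works can be checked in base R. The sketch below uses a small made-up data frame and mimics, only roughly, what happens to a string mapping: it is parsed into an expression and later evaluated against the data.

```r
errdf <- data.frame(X = 1:3, Y = c(5, 6, 7), eY = c(0.5, 1, 1.5))
cols <- names(errdf)[c(2, 3)]  # c("Y", "eY")

# Subtracting the names themselves fails: they are character strings
failed <- inherits(try(cols[1] - cols[2], silent = TRUE), "try-error")

# Pasting builds the text of an expression instead
ymin_string <- paste(cols, collapse = "-")  # "Y-eY"
ymax_string <- paste(cols, collapse = "+")  # "Y+eY"

# Parsing and evaluating that text against the data gives the bar limits
ymin <- eval(parse(text = ymin_string), errdf)
ymax <- eval(parse(text = ymax_string), errdf)
```

In recent ggplot2 releases aes_string() is deprecated. The same mapping can be written with aes() and the .data pronoun, for example aes(ymin = .data[[y]] - .data[[err]]), where y and err are assumed to hold the column names as strings.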
