I want to add a table to each panel of a faceted plot, with different contents in each panel.
I'm using ggplot2 and facet_grid.
Say I want to add some descriptive statistics to each panel, and they are not necessarily the same across panels.
These statistics are stored in a data frame I made for that purpose.
I found a few ways to add such tables to a plot, but:
As far as I can tell, annotate() will give me the same table in every panel of the facet.
I would really like to use facet_wrap for its simplicity, rather than gridExtra...
library(ggplot2) # the mpg dataset ships with ggplot2
ggplot(data = mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  facet_wrap( ~ cyl, scales = "free_y")
The position of the table is not that important to me, but I don't want it to overlap the plotted data.
My objective is a mixture of the two answers in this thread:
Adding table to ggplot with facets
The first answer (with annotate) won't work for me, since I want the table in each panel to be unique.
The second answer is better, but I don't want the table to overlap or hide details in the graph, and the lines/points sit in a different place in each panel, so I can't use it as is. I would like the table to be attached to the panel, as with annotate.
Try this:
library(ggplot2)
library(tibble)
library(gridExtra)
library(grid)
GeomCustom <- ggproto(
  "GeomCustom",
  Geom,
  setup_data = function(self, data, params) {
    data <- ggproto_parent(Geom, self)$setup_data(data, params)
    data
  },
  draw_group = function(data, panel_scales, coord) {
    vp <- grid::viewport(x=data$x, y=data$y)
    g <- grid::editGrob(data$grob[[1]], vp=vp)
    ggplot2:::ggname("geom_custom", g)
  },
  required_aes = c("grob","x","y")
)
geom_custom <- function(mapping = NULL,
                        data = NULL,
                        stat = "identity",
                        position = "identity",
                        na.rm = FALSE,
                        show.legend = NA,
                        inherit.aes = FALSE,
                        ...) {
  layer(
    geom = GeomCustom,
    mapping = mapping,
    data = data,
    stat = stat,
    position = position,
    show.legend = show.legend,
    inherit.aes = inherit.aes,
    params = list(na.rm = na.rm, ...)
  )
}
gl <- list(tableGrob(iris[1:2,1:3]),
           tableGrob(iris[1:4,1:3]),
           tableGrob(iris[1:3,1:3]),
           tableGrob(iris[1:2,1:2]))
dummy <- tibble(f=letters[1:4], grob = gl )
d <- tibble(x=rep(1:3, 4), f=rep(letters[1:4], each=3))
ggplot(d, aes(x,x)) +
  facet_wrap(~f) +
  theme_bw() +
  geom_custom(data=dummy, aes(grob=grob), x = 0.5, y = 0.5)
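To go from the toy iris tables to the per-panel statistics asked about in the question, one option is to build the list of table grobs from the data itself. The following is only a minimal sketch under assumptions that are not part of the answer above: it uses the mpg data from the question, and the names stats_by_cyl and panel_tables as well as the choice of statistics are illustrative.

```r
library(ggplot2)    # provides the mpg data
library(gridExtra)  # tableGrob()
library(tibble)

# One small summary table per facet level (here: per value of cyl)
stats_by_cyl <- lapply(split(mpg, mpg$cyl), function(d) {
  data.frame(n        = nrow(d),
             mean_hwy = round(mean(d$hwy), 1),
             sd_hwy   = round(sd(d$hwy), 1))
})

# Turn every summary into a table grob; rows = NULL drops the row names
grobs <- lapply(stats_by_cyl, tableGrob, rows = NULL)

# Same shape as `dummy` above: a faceting column plus a list column of grobs
# (panel_tables is an illustrative name)
panel_tables <- tibble(cyl = as.integer(names(grobs)), grob = unname(grobs))
```

Since panel_tables has the same layout as dummy, it should be usable as geom_custom(data = panel_tables, aes(grob = grob), x = 0.5, y = 0.5) on a plot of mpg faceted with facet_wrap(~ cyl).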
I find it very difficult to add site labels to a DCA plot made with autoplot or ggplot.
I also want to distinguish the points in the autoplot/ggplot according to their groups.
This is the data and the code I used; it went well until the autoplot/ggplot call:
library(vegan)
library(ggvegan) # provides the autoplot() methods for vegan objects
data(dune)
d <- vegdist(dune)
csin <- hclust(d, method = "single")
cl <- cutree(csin, 3)
dune.dca <- decorana(dune)
autoplot(dune.dca)
This is the autoplot obtained:
I am using simple code, and I tried the following, but none of it led anywhere:
autoplot(dune.dca, label.size = 3, data = dune, colour = cl)
ggplot(dune.dca(x=DCA1, y=DCA2,colour=cl))
ggplot(dune.dca, display = ‘site’, pch = 16, col = cl)
ggrepel::geom_text_repel(aes(dune.dca))
If anyone has a simple suggestion, it could be great.
With the added information (package) I was able to go and dig a bit deeper.
The problem is (in short) that autoplot.decorana adds the data to the specific layer (either geom_point or geom_text). That data is not inherited by other layers, so adding further layers results in blank pages.
Notice that one of the two snippets below results in an error, and note the position of the data argument:
# Error:
ggplot() +
geom_point(data = mtcars, mapping = aes_string(x = 'hp', y = 'mpg')) +
geom_label(aes(x = hp, y = mpg, label = cyl))
# Works:
ggplot(data = mtcars) +
geom_point(mapping = aes_string(x = 'hp', y = 'mpg')) +
geom_label(aes(x = hp, y = mpg, label = cyl))
ggvegan:::autoplot.decorana places the data as in the example that returns an error.
I see 2 ways to get around this problem:
Extract the layers data using ggplot_build or layer_data and create an overall or single layer mapping.
Extract the code for generating the data, and create our plot manually (not using autoplot).
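The first option can be sketched on the mtcars example from above rather than on the ggvegan plot. layer_data() returns the computed data of one layer, which can then be handed explicitly to any additional layer. This is only an illustration: the object names p and pts are assumptions, and the columns returned for an autoplot.decorana plot may differ.

```r
library(ggplot2)

# A plot whose data lives only in the point layer, as in the failing example
p <- ggplot() +
  geom_point(data = mtcars, mapping = aes(x = hp, y = mpg))

# Pull the computed data of layer 1; positions are stored in columns x and y
pts <- layer_data(p, 1)

# Pass the extracted data to the new layer explicitly instead of relying on inheritance
p + geom_label(data = pts, mapping = aes(x = x, y = y, label = y))
```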
I honestly think the second is simpler, as we might have to extract more information to make our data sensible. By looking at the source code of ggvegan:::autoplot.decorana (simply print it to the console by leaving out the parentheses), we can extract the code below, which generates the same data as used in the plot:
ggvegan_data <- function(object, axes = c(1, 2), layers = c("species", "sites"), ...){
  obj <- fortify(object, axes = axes, ...)
  obj <- obj[obj$Score %in% layers, , drop = FALSE]
  want <- obj$Score %in% c("species", "sites")
  obj[want, , drop = FALSE]
}
With this we can then generate any plot that we desire, with appropriate mappings rather than layer-individual mappings
dune.plot.data <- ggvegan_data(dune.dca)
# plot the extracted data frame, not the decorana object itself
p <- ggplot(data = dune.plot.data, aes(x = DCA1, y = DCA2, colour = Score)) +
  geom_point() +
  geom_text(aes(label = Label), nudge_y = 0.3)
p
Which gives us what I hope is your desired output
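The question also asked for points coloured by the cluster groups in cl. A possible sketch, which sidesteps ggvegan and takes the site scores directly from vegan::scores(); the names site_scores, site and cluster are illustrative, and matching cl to the sites by name assumes that cutree() keeps the row names of dune:

```r
library(vegan)
library(ggplot2)

data(dune)
cl <- cutree(hclust(vegdist(dune), method = "single"), 3)
dune.dca <- decorana(dune)

# Site scores on the first two DCA axes, one row per site
site_scores <- as.data.frame(scores(dune.dca, display = "sites", choices = 1:2))
site_scores$site <- rownames(site_scores)
# Look up each site's cluster by name (assumes cl is named by the rows of dune)
site_scores$cluster <- factor(cl[site_scores$site])

ggplot(site_scores, aes(x = DCA1, y = DCA2, colour = cluster)) +
  geom_point() +
  geom_text(aes(label = site), nudge_y = 0.1, show.legend = FALSE)
```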
I'm adding a color aesthetic to a faceted histogram. In the reprex below, with no color aesthetic, the histogram only shows data within that facet level. However, with color defined, a baseline is added which stretches to include the range of data across all facets. Is there a way to make this not happen?
I'm looking for something similar to geom_density with trim = TRUE, but there doesn't appear to be a trim option for geom_histogram.
library(tidyverse)
data <- tibble(a = rchisq(1000, df = 3),
b = rchisq(1000, df = 1),
c = rchisq(1000, df = 10)) %>%
gather()
ggplot(data, aes(x = value)) +
geom_histogram() +
facet_wrap(~ key, ncol = 1)
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
ggplot(data, aes(x = value)) +
geom_histogram(color = "red") +
facet_wrap(~ key, ncol = 1)
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
ggplot(data, aes(x = value)) +
geom_density(color = "red", trim = TRUE) +
facet_wrap(~ key, ncol = 1)
Created on 2019-07-20 by the reprex package (v0.3.0)
geom_histogram draws its bars using rectGrob from the grid package, and a zero-width / zero-height rectGrob is depicted as a vertical / horizontal line in the outline colour, at least in my set-up for RStudio (and the OP's as well, I presume). Demonstration below:
library(grid)
r1 <- rectGrob(width = unit(0, "npc"), gp = gpar(col = "red", fill = "grey")) # zero-width
r2 <- rectGrob(height = unit(0, "npc"), gp = gpar(col = "red", fill = "grey")) # zero-height
grid.draw(r1) # depicted as a vertical line, rather than disappear completely
grid.draw(r2) # depicted as a horizontal line, rather than disappear completely
In this case, if we check the data frame associated with the histogram layer, there are many rows with ymin = ymax = 0, which are responsible for the 'baseline' effect seen in the question.
p <- ggplot(data, aes(x = value)) +
geom_histogram(color = "red") +
facet_wrap(~ key, ncol = 1)
View(layer_data(p) %>% filter(PANEL == 2)) # look at the data associated with facet panel 2
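For a non-interactive check, the zero-height rows can also be counted per panel. A small sketch with its own two-facet toy data (dat, ld and zero_rows are illustrative names, not from the question):

```r
library(ggplot2)

set.seed(1)
dat <- data.frame(key   = rep(c("a", "b"), each = 500),
                  value = c(rchisq(500, df = 3), rchisq(500, df = 10)))

p <- ggplot(dat, aes(x = value)) +
  geom_histogram(color = "red", bins = 30) +
  facet_wrap(~ key, ncol = 1)

# Computed data of the histogram layer, one row per bin and panel
ld <- layer_data(p)

# Bins drawn as zero-height rectangles: these are what show up as the baseline
zero_rows <- ld[ld$ymin == 0 & ld$ymax == 0, ]
table(zero_rows$PANEL)
```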
Workaround: Since the data calculations are done in StatBin's compute_group function, we can define an alternative version of the same function, with an additional step to drop the 0-count rows from the data frame completely:
# StatBin2 inherits from StatBin, except for an
# additional second-to-last line in the compute_group() function
StatBin2 <- ggproto(
  "StatBin2",
  StatBin,
  compute_group = function (data, scales, binwidth = NULL, bins = NULL,
                            center = NULL, boundary = NULL,
                            closed = c("right", "left"), pad = FALSE,
                            breaks = NULL, origin = NULL, right = NULL,
                            drop = NULL, width = NULL) {
    if (!is.null(breaks)) {
      if (!scales$x$is_discrete()) {
        breaks <- scales$x$transform(breaks)
      }
      bins <- ggplot2:::bin_breaks(breaks, closed)
    }
    else if (!is.null(binwidth)) {
      if (is.function(binwidth)) {
        binwidth <- binwidth(data$x)
      }
      bins <- ggplot2:::bin_breaks_width(scales$x$dimension(), binwidth,
                                         center = center, boundary = boundary,
                                         closed = closed)
    }
    else {
      bins <- ggplot2:::bin_breaks_bins(scales$x$dimension(), bins,
                                        center = center, boundary = boundary,
                                        closed = closed)
    }
    res <- ggplot2:::bin_vector(data$x, bins, weight = data$weight, pad = pad)
    # drop 0-count bins completely before returning the dataframe
    res <- res[res$count > 0, ]
    res
  })
Usage:
ggplot(data, aes(x = value)) +
geom_histogram(color = "red", stat = StatBin2) + # specify stat = StatBin2
facet_wrap(~ key, ncol = 1)
Another option is to use after_stat() in the y aesthetic, with an ifelse() that checks whether the computed count is greater than 0 and otherwise replaces it with NA. This removes the baseline outline:
library(tidyverse)
ggplot(data, aes(x = value, y = ifelse(after_stat(count) > 0, after_stat(count), NA))) +
geom_histogram(color = "red") +
facet_wrap(~ key, ncol = 1)
Created on 2023-02-15 with reprex v2.0.2
I've made this multiple histogram plot in ggplot and now I want to add a legend for both the light purple part and the dark purple part. I know the conventional way is to do it with aes, but I can't figure out how to integrate this into my multiple histogram plot.
I don't shy away from manual labour, but more sophisticated solutions are preferred. Can anyone help me out?
library(ggplot2)
#dataframe
set.seed(20)
df <- data.frame(expl = rbinom(n=100, size = 1, prob=0.08),
                 resp = sample(50:100, size = 100, replace = TRUE))
#graph
graph <- ggplot(data = df, aes(x = resp))
graph +
  geom_histogram(fill = "#BEBADA", alpha = 0.5, bins = 10) +
  geom_histogram(data = subset(df, expl == '1'), fill = "#BEBADA", bins = 10)
Your data is already in the long format that is well suited for ggplot; you just need to map expl to alpha. In general, if you find yourself making multiples of the same geom, you probably want to rethink either the shape of your data or your approach for feeding it into geoms.
library(tidyverse)
set.seed(20)
df <- data.frame(expl = rbinom(n=100, size = 1, prob=0.08),
resp = sample(50:100, size = 100, replace = T))
To map expl onto alpha, make it a factor, and then assign that to alpha inside your aes. Then you can set the alpha scale to values of 0.5 and 1.
ggplot(df, aes(x = resp, alpha = as.factor(expl))) +
geom_histogram(fill = "#bebada", bins = 10) +
scale_alpha_manual(values = c(0.5, 1))
However, differentiating by alpha is a little awkward. You could instead map to fill and use light and dark purples:
ggplot(df, aes(x = resp, fill = as.factor(expl))) +
geom_histogram(bins = 10) +
scale_fill_manual(values = c("0" = "mediumpurple1", "1" = "mediumpurple4"))
Note also that you can adjust the position of the histogram bars if you need to, by assigning geom_histogram(position = ...), where you could fill in with something such as "dodge" if that's what you'd like.
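As a rough illustration of the position argument, the sketch below builds the same histogram three ways. The object names p_stack, p_dodge and p_identity are only for this example; the data is the one from the question.

```r
library(ggplot2)

set.seed(20)
df <- data.frame(expl = rbinom(n = 100, size = 1, prob = 0.08),
                 resp = sample(50:100, size = 100, replace = TRUE))

base <- ggplot(df, aes(x = resp, fill = as.factor(expl)))

# Default ("stack"): bars of the two groups sit on top of each other
p_stack <- base + geom_histogram(bins = 10)

# "dodge": bars of the two groups are placed side by side within each bin
p_dodge <- base + geom_histogram(bins = 10, position = "dodge")

# "identity": bars overlap, similar to the two-layer plot in the question
p_identity <- base + geom_histogram(bins = 10, position = "identity", alpha = 0.5)

p_dodge
```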
If you want a legend on the alpha value, the idea is to include it as an aesthetic rather than as a direct argument as you tried. In order to do this, a simple solution is to enrich the data frame used by ggplot:
df2 <- rbind(
cbind(df, filter="all lines"),
cbind(subset(df, expl == '1'), filter="expl==1")
)
df2 corresponds to df after appending the lines from your subset of interest (with a field filter telling from which copy each record comes)
Then, this solves your problem
ggplot(df2, aes(resp, alpha=filter)) +
geom_histogram(fill="#BEBADA", bins=10, position="identity") +
scale_alpha_discrete(range=c(.5,1))
I'm trying to use stat_ecdf() to plot cumulative successes as a function of a rank score created by a predictive model.
#libraries
require(ggplot2)
require(scales)
# fake data for reproducibility
set.seed(123)
n <- 200
df <- data.frame(model_score= rexp(n=n,rate=1:n),
obs_set= sample(c("training","validation"),n,replace=TRUE))
df$model_rank <- rank(df$model_score)/n
df$target_outcome <- rbinom(n,1,1-df$model_rank)
# Plot Gain Chart using stat_ecdf()
ggplot(subset(df,target_outcome==1),aes(x = model_rank)) +
stat_ecdf(aes(colour = obs_set), size=1) +
scale_x_continuous(limits=c(0,1), labels=percent,breaks=seq(0,1,.1)) +
xlab("Model Percentile") + ylab("Percent of Target Outcome") +
scale_y_continuous(limits=c(0,1), labels=percent) +
geom_segment(aes(x=0,y=0,xend=1,yend=1),
colour = "gray", linetype="longdash", size=1) +
ggtitle("Gain Chart")
All I want to do is force the ECDF to start at (0,0) and end at (1,1) so that there are no gaps at the beginning or end of the curve. If possible, I'd like to do it within the syntax of ggplot2, but I'd settle for a clever workaround.
@Henrik, this is NOT a duplicate of that question, because I have already defined my limits with scale_x_continuous() and scale_y_continuous(), and adding expand_limits() doesn't do anything. It is not the origin of the PLOT but the endpoints of the stat_ecdf() curve that need to be fixed.
Unfortunately, the definition of stat_ecdf gives no wiggle room here; it determines the endpoints internally.
There is a somewhat advanced solution. With the latest version of ggplot2 (devtools::install_github("hadley/ggplot2")), the extensibility is improved, to the point where it is possible to override this behavior, but not without some boilerplate.
stat_ecdf2 <- function(mapping = NULL, data = NULL, geom = "step",
                       position = "identity", n = NULL, show.legend = NA,
                       inherit.aes = TRUE, minval=NULL, maxval=NULL,...) {
  layer(
    data = data,
    mapping = mapping,
    stat = StatEcdf2,
    geom = geom,
    position = position,
    show.legend = show.legend,
    inherit.aes = inherit.aes,
    stat_params = list(n = n, minval=minval,maxval=maxval),
    params = list(...)
  )
}
StatEcdf2 <- ggproto("StatEcdf2", StatEcdf,
  calculate = function(data, scales, n = NULL, minval=NULL, maxval=NULL, ...) {
    df <- StatEcdf$calculate(data, scales, n, ...)
    if (!is.null(minval)) { df$x[1] <- minval }
    if (!is.null(maxval)) { df$x[length(df$x)] <- maxval }
    df
  }
)
Now, stat_ecdf2 will behave the same as stat_ecdf, but with an optional minval and maxval parameter. So this will do the trick:
ggplot(subset(df,target_outcome==1),aes(x = model_rank)) +
stat_ecdf2(aes(colour = obs_set), size=1, minval=0, maxval=1) +
scale_x_continuous(limits=c(0,1), labels=percent,breaks=seq(0,1,.1)) +
xlab("Model Percentile") + ylab("Percent of Target Outcome") +
scale_y_continuous(limits=c(0,1), labels=percent) +
geom_segment(aes(x=0,y=0,xend=1,yend=1),
colour = "gray", linetype="longdash", size=1) +
ggtitle("Gain Chart")
The big caveat here is that I don't know if the current extensibility model will be supported in the future; it has changed several times in the past, and the change to use "ggproto" is recent -- like July 15th 2015 recent.
As a plus, this gave me a chance to really dig into ggplot's internals, which is something that I've been meaning to do for a while.
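Given that caveat, a workaround that does not touch ggplot2 internals may be easier to keep working: compute the ECDF per group with stats::ecdf(), add the two endpoints by hand, and draw the result with geom_step(). This is a sketch of a swapped-in technique, not the stat_ecdf2 approach above; the helper ecdf_with_endpoints() and the data frames hits and gain are hypothetical names introduced here.

```r
library(ggplot2)

# Same fake data as in the question
set.seed(123)
n <- 200
df <- data.frame(model_score = rexp(n = n, rate = 1:n),
                 obs_set = sample(c("training", "validation"), n, replace = TRUE))
df$model_rank <- rank(df$model_score) / n
df$target_outcome <- rbinom(n, 1, 1 - df$model_rank)
hits <- subset(df, target_outcome == 1)

# Hypothetical helper: ECDF values at the observed points, padded with
# (lower, 0) at the start and (upper, 1) at the end
ecdf_with_endpoints <- function(x, lower = 0, upper = 1) {
  xs <- sort(unique(x))
  data.frame(x = c(lower, xs, upper),
             y = c(0, ecdf(x)(xs), 1))
}

# One padded ECDF per obs_set, stacked into a single data frame
gain <- do.call(rbind, lapply(split(hits, hits$obs_set), function(d) {
  data.frame(obs_set = d$obs_set[1], ecdf_with_endpoints(d$model_rank))
}))

ggplot(gain, aes(x = x, y = y, colour = obs_set)) +
  geom_step() +
  geom_abline(slope = 1, intercept = 0, colour = "gray", linetype = "longdash") +
  labs(x = "Model Percentile", y = "Percent of Target Outcome", title = "Gain Chart")
```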
I am trying to use geom_ribbon to mimic the behavior of geom_area,
but I have not been successful. Would you have any hint as to why the following does not work?
I relied on Hadley's statement from the ggplot2 geom_area documentation page:
"An area plot is a special case of geom_ribbon, where the minimum of the range is fixed to 0, and the position adjustment defaults to position_stacked."
test <- expand.grid(Param = LETTERS[1:3], x = 1:5)
test$y <- test$x
# Ok
p <- ggplot(test)
p <- p + geom_area(aes(x = x, y = y, group = Param, fill = Param), alpha = 0.3)
p
# not ok - initial idea
p <- ggplot(test)
p <- p + geom_ribbon(aes(x = x, ymin = 0, ymax = y, group = Param, fill = Param), alpha = 0.3, position = position_stack())
p
Further, how can I look at the code of functions written the way the geom_XXX functions are?
My usual way gives the following, which is not very useful:
> geom_ribbon
function (mapping = NULL, data = NULL, stat = "identity", position = "identity",
na.rm = FALSE, ...)
GeomRibbon$new(mapping = mapping, data = data, stat = stat, position = position,
na.rm = na.rm, ...)
Thanks for your help
Regards
Pascal
You just didn't map a variable to y in your geom_ribbon call. Adding y = y causes it to work for me. In general, geom_ribbon doesn't require a y aesthetic, but I believe it does in the case of stacking. I presume there's a well-thought out reasoning for why that is, but you never know...
Also, all the source code for ggplot2 is on GitHub.
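A sketch of both points. The first part adds the y mapping suggested above to the ribbon call from the question; how position_stack() treats ymin/ymax has changed between ggplot2 releases, so treat the stacking behaviour as version-dependent. The second part assumes a ggproto-based ggplot2 (2.0.0 or later), where the geom_* functions are thin wrappers and the drawing logic lives in objects such as GeomRibbon and GeomArea. The name p_ribbon is illustrative.

```r
library(ggplot2)

test <- expand.grid(Param = LETTERS[1:3], x = 1:5)
test$y <- test$x

# geom_ribbon with y mapped in addition to ymin/ymax,
# so that stacking has a height to work with
p_ribbon <- ggplot(test) +
  geom_ribbon(aes(x = x, y = y, ymin = 0, ymax = y, group = Param, fill = Param),
              alpha = 0.3, position = position_stack())
p_ribbon

# Inspecting the source from the console:
# the wrapper function only assembles a layer ...
print(geom_ribbon)
# ... while the drawing code is a method of the ggproto object
print(GeomRibbon$draw_group)
# GeomArea is defined as a child of GeomRibbon
class(GeomArea)
```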