ggplot2 facet labels - second line is not displayed - r

I have a script that used to produce a faceted plot with strip text on multiple lines, but it no longer works. Below is a MWE where the strip text should be parsed from, e.g., "bold(A)\nreally~long~extra" to:
A
really long extra
The second line is cut off as you can see via the debug function. I even increased the margins but to no avail...
Any ideas what is the issue?
library(dplyr)
library(ggplot2)
exmpl = data.frame(a = 1:100,
                   b = rep(1:5, 20),
                   f = factor(rep(LETTERS[1:5], each = 20))) %>%
  as_tibble() %>%
  mutate(f2 = paste0("bold(", f, ")\nreally~long~extra"))
ggplot(exmpl, aes(x = b, y = a)) +
  facet_grid(. ~ f2, labeller = label_parsed) +
  geom_point() +
  theme(strip.text.x = element_text(size = 10, hjust = 0, margin = margin(.5, 0, .5, 0, "cm"), debug = TRUE))
EDIT:
And while we are at it, I only came up with this workaround because my previous solution of using label_bquote() does not work anymore. Please have a look at this other question, maybe you can help me with this, too?

Not sure whether this works for you, but one way to achieve the desired result is the ggtext package, which allows you to style your facet labels using Markdown and a limited subset of HTML/CSS. To this end, ggtext introduces a new theme element, element_markdown. Try this:
library(ggplot2)
library(dplyr)
exmpl = data.frame(a = 1:100,
b = rep(1:5, 20),
f = factor(rep(LETTERS[1:5], each = 20))) %>%
as_tibble() %>%
mutate(f2 = paste0("<b>", f, "</b><br>", "really long extra"))
ggplot(exmpl, aes(x = b, y = a)) +
facet_grid(. ~ f2) +
geom_point() +
theme(strip.text.x = ggtext::element_markdown(size = 10, hjust = 0))
And for the second question in your former post a solution might look like so:
mylabel <- function(x) {
mutate(x, Species = paste0(letters[Species], " <i>", Species, "</i>"))
}
p <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) + geom_point()
p + facet_grid(. ~ Species, labeller = mylabel) +
theme(strip.text.x = ggtext::element_markdown())
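For the multi-line strip label in the first question, here is a sketch of an alternative that stays within plotmath and label_parsed. A likely reason the second line disappears is that the string parses into two separate expressions, of which apparently only the first is drawn. The sketch assumes that plotmath's atop(), which stacks one expression over another, is an acceptable substitute for the line break; the name p_atop is only illustrative.

```r
library(ggplot2)

exmpl <- data.frame(a = 1:100,
                    b = rep(1:5, 20),
                    f = factor(rep(LETTERS[1:5], each = 20)))

# atop(x, y) is plotmath for "x stacked over y", so no "\n" is needed.
exmpl$f2 <- paste0("atop(bold(", exmpl$f, "), really~long~extra)")

# The original label parses into two expressions, the atop() version into one.
length(parse(text = "bold(A)\nreally~long~extra"))
length(parse(text = exmpl$f2[1]))

p_atop <- ggplot(exmpl, aes(x = b, y = a)) +
  facet_grid(. ~ f2, labeller = label_parsed) +
  geom_point() +
  theme(strip.text.x = element_text(size = 10))
p_atop
```

Note that atop() centres the two lines relative to each other, so this does not reproduce a left-aligned label; if that matters, the ggtext approach above is probably the better fit.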

Related

Is there a way to add legend and count to each level for geom_point?

Is there a way to add a legend with the count to give density of each row?
Or an easier way to show it?
Thanks very much!
Couldn't even get a legend added :)
Code I used:
data %>%
ggplot(aes(x = subscribed, y = campaign)) +
geom_point () +
geom_jitter()
You could create a label per group (subscribed) beforehand that includes the number of observations, n(), and store it as a string column. Mapping that column to an aesthetic such as colour makes it appear in the legend. Here is a reproducible example:
library(dplyr)
library(ggplot2)
df %>%
group_by(subscribed) %>%
mutate(count = paste0(subscribed, ' (n = ', n(), ')')) %>%
ggplot(aes(subscribed, campaign, colour = factor(count))) +
geom_jitter()
Created on 2023-01-12 with reprex v2.0.2
Created data:
df <- data.frame(campaign = runif(100),
subscribed = rep(c("no", "yes"), 50))
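A variation on the same idea, as a rough sketch: instead of adding a label column to the data, the counts can be computed once and passed to the colour scale as legend labels. This assumes the same simulated data as above (with a seed added for reproducibility); the names n_by_group, legend_labels and p_counts are only illustrative.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(campaign = runif(100),
                 subscribed = rep(c("no", "yes"), 50))

# Count the observations per group and build "level (n = count)" labels,
# named by level so the discrete scale can match them to its breaks.
n_by_group <- table(df$subscribed)
legend_labels <- setNames(
  paste0(names(n_by_group), " (n = ", n_by_group, ")"),
  names(n_by_group)
)

p_counts <- ggplot(df, aes(subscribed, campaign, colour = subscribed)) +
  geom_jitter(width = 0.2, height = 0) +
  scale_colour_discrete(labels = legend_labels)
p_counts
```

The data stay untouched this way, which may be convenient if the same data frame feeds several plots.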
I found another way to show similar data to this, in a more clear manner.
However, I couldn't figure out the legend lol
The code I used was :
p <- ggplot(data = data, aes(x = subscribed, y = pdays)) +
geom_count() + scale_size_continuous(range = c(7, 30))
p + geom_text(data = ggplot_build(p)$data[[1]],
aes(x, y, label = n), color = "#ffffff") +
scale_y_continuous(breaks = seq(0, 30, by = 4))

How to add captions outside the plot on individual facets in ggplot2?

I am trying to add a caption to each facet (I am using facet_grid). I have seen this approach and this one, but neither gives me what I need. Also, the first approach returns a warning message for which I couldn't find a solution:
Warning message:
Vectorized input to `element_text()` is not officially supported.
Results may be unexpected or may change in future versions of ggplot2.
My example:
library(ggplot2)
library(datasets)
mydf <- CO2
a <- ggplot(data = mydf, aes(x = conc)) + geom_histogram(bins = 15, alpha = 0.75) +
labs(y = "Frequency") + facet_grid(Type ~ Treatment)
a
caption_df <- data.frame(
cyl = c(4,6),
txt = c("1st=4", "2nd=6")
)
a + coord_cartesian(clip="off", ylim=c(0, 3)) +
geom_text(
data=caption_df, y=1, x=100,
mapping=aes(label=txt), hjust=0,
fontface="italic", color="red"
) +
theme(plot.margin = margin(b=25))
The idea is to have 1 caption per plot, but with this approach it repeats the caption and it is overwritten.
Is it possible to have something like this? (caption OUTSIDE the plot) (but without the previous warning)
a + labs(caption = c("nonchilled=4", "chilled=6")) + theme(plot.caption = element_text(hjust=c(0, 1)))
NOTE: This is only an example, but I may need to put long captions (sentences) for each plot.
Example:
a + labs(caption = c("This is my first caption that maybe it will be large. Color red, n= 123", "This is my second caption that maybe it will be large. Color blue, n= 22")) +
theme(plot.caption = element_text(hjust=c(1, 0)))
Does anyone know how to do it?
Thanks in advance
You need to add the same faceting variables to your caption data frame as are present in your main data frame, to specify the facet in which each caption should be placed. If you want some facets unlabelled, simply use an empty string.
caption_df <- data.frame(
cyl = c(4, 6, 8, 10),
conc = c(0, 1000, 0, 1000),
Freq = -1,
txt = c("1st=4", "2nd=6", '', ''),
Type = rep(c('Quebec', 'Mississippi'), each = 2),
Treatment = rep(c('chilled', 'nonchilled'), 2)
)
a + coord_cartesian(clip="off", ylim=c(0, 3), xlim = c(0, 1000)) +
geom_text(data = caption_df, aes(y = Freq, label = txt)) +
theme(plot.margin = margin(b=25))
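As a self-contained sketch of the same principle, the version below attaches one caption per column to the bottom-row facets only. It assumes the CO2 data from the question and that anchoring the text at the lower-left panel corner and pushing it down with vjust is acceptable; the vjust and margin values are guesses that will likely need tuning, and p_cap is only an illustrative name.

```r
library(ggplot2)

# One caption per column, placed only in the bottom row (Type == "Mississippi").
# The faceting variables use the same factor levels as the main data.
caption_df <- data.frame(
  Type      = factor("Mississippi", levels = levels(CO2$Type)),
  Treatment = factor(c("nonchilled", "chilled"), levels = levels(CO2$Treatment)),
  txt       = c("nonchilled=4", "chilled=6")
)

p_cap <- ggplot(CO2, aes(x = conc)) +
  geom_histogram(bins = 15, alpha = 0.75) +
  facet_grid(Type ~ Treatment) +
  # -Inf anchors the text at the lower-left corner of the panel;
  # vjust > 1 moves it below the panel, which needs clip = "off".
  geom_text(data = caption_df, aes(x = -Inf, y = -Inf, label = txt),
            hjust = 0, vjust = 4, fontface = "italic", inherit.aes = FALSE) +
  coord_cartesian(clip = "off") +
  labs(y = "Frequency") +
  theme(plot.margin = margin(b = 25))
p_cap
```

Because the captions are ordinary text layers, long sentences are not wrapped automatically; inserting "\n" into txt by hand is one way to break them.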

Normal curves on multiple histograms on a same plot

My example dataframe:
sample1 <- seq(100,157, length.out = 50)
sample2 <- seq(113, 167, length.out = 50)
sample3 <- seq(95,160, length.out = 50)
sample4 <-seq(88, 110, length.out = 50)
df <- as.data.frame(cbind(sample1, sample2, sample3, sample4))
I have managed to create histograms for these four variables, which share the same y-axis. Now I need an overlay normal curve. Based on previous posts, I've managed a density curve, but this is not what I want. This comes close, but I'd like a smooth line...
This is my current code for plotting:
df <- as.data.table(df)
new.df<-melt(df,id.vars="sample")
names(new.df)=c("sample","type","value")
cdat <- ddply(new.df, "type", summarise, value.mean=mean(value))
ggplot(data = new.df,aes(x=value)) +
geom_histogram(aes(x = value), bins = 15, colour = "black", fill = "gray") +
facet_wrap(~ type) + geom_density(aes(x = value),alpha=.2, fill="#FF6666") +
geom_vline(data=cdat, aes(xintercept=value.mean),
linetype="dashed", size=1, colour="black") +
theme_classic() +
theme(text = element_text(size = 15), element_line(size = 0.5),aspect.ratio = 0.75 )
And I found the following code, which I hoped would do the trick, but this gives me nothing:
stat_function(fun = dnorm, args = list(mean = mean(df$value), sd = sd(df$value)))
Unfortunately, stat_function doesn't play nicely with facets: it overlays the same function on each facet without taking account of the faceting variable.
One of the most common reasons I see for people posting ggplot questions on Stack Overflow is that they get lost while trying to coerce ggplot to do too much of their data manipulation. Functions like geom_smooth and geom_function are useful helpers for common tasks, but if you want to do something that is complex or uncommon, it is best to produce the data you want to plot, then plot it.
In fact, the main author of ggplot2 recommends this approach for a very similar problem to yours in this thread, saying:
I think you are better off generating the data outside of ggplot2 and then plotting it. See https://speakerdeck.com/jennybc/row-oriented-workflows-in-r-with-the-tidyverse to get started.
Hadley Wickham, 26 April 2018
So here's one way of doing that using tidyverse. You create a data frame of the dnorm for each sample and plot these using plain old geom_line.
Note that your histograms are counts, so you either need to change them to density, or multiply the dnorm output by the number of observations * the binwidth, otherwise you will just get an apparently "flat" line on the x axis, since the dnorm values will all be so small in relation to the counts:
library(plyr)
library(dplyr)
library(tidyr)
library(ggplot2)
dfn <- df %>%
pivot_longer(everything()) %>%
ddply("name", function(x) {
xvar <- seq(min(x$value), max(x$value), length.out = 100)
data.frame(value = xvar,
y = 5 * nrow(x) * dnorm(xvar, mean(x$value), sd(x$value)))
})
df %>%
pivot_longer(everything()) %>%
group_by(name) %>%
mutate(mean = mean(value), sd = sd(value)) %>%
ggplot(aes(value)) +
geom_histogram(aes(x = value), binwidth = 5,
colour = "black", fill = "gray") +
facet_wrap(~ name) +
geom_vline(aes(xintercept = mean),
linetype = "dashed", size=1, colour="black") +
geom_line(data = dfn, aes(y = y)) +
theme_classic() +
theme(text = element_text(size = 15), line = element_line(size = 0.5),
aspect.ratio = 0.75)
Created on 2020-12-07 by the reprex package (v0.3.0)
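The scaling step can also be sketched in base R, without plyr. A histogram bar holds roughly n * binwidth * f(x) observations, where f is the density, so the normal curve has to be multiplied by both factors to sit on the count scale. The sketch assumes the example data from the question; curves and binwidth are illustrative names, and the columns name and value are chosen to match the defaults of pivot_longer().

```r
sample1 <- seq(100, 157, length.out = 50)
sample2 <- seq(113, 167, length.out = 50)
sample3 <- seq(95, 160, length.out = 50)
sample4 <- seq(88, 110, length.out = 50)
df <- data.frame(sample1, sample2, sample3, sample4)

binwidth <- 5  # must match the binwidth passed to geom_histogram()

# One block of 100 curve points per sample, stacked into a long data frame.
curves <- do.call(rbind, lapply(names(df), function(nm) {
  v <- df[[nm]]
  xvar <- seq(min(v), max(v), length.out = 100)
  data.frame(name = nm,
             value = xvar,
             # density -> expected count per bin: n * binwidth * f(x)
             y = length(v) * binwidth * dnorm(xvar, mean(v), sd(v)))
}))
head(curves)
```

The result could then be passed to geom_line(data = curves, aes(y = y)) in place of dfn in the plot above.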

Adding a single label per group in ggplot with stat_summary and text geoms

I would like to add counts to a ggplot that uses stat_summary().
I am having an issue with the requirement that the text vector be the same length as the data.
With the examples below, you can see that what is being plotted is the same label multiple times.
The workaround to set the location on the y axis has the effect that multiple labels are stacked up. The visual effect is a bit strange (particularly when you have thousands of observations) and not sufficiently professional for my purposes. You will have to trust me on this one - the attached picture doesn't fully convey the weirdness of it.
I was wondering if someone else has worked out another way. It is for a plot in shiny that has dynamic input, so text cannot be overlaid in a hardcoded fashion.
I'm pretty sure ggplot wasn't designed for the kind of behaviour with stat_summary that I am looking for, and I may have to abandon stat_summary and create a new summary dataframe, but thought I would first check if someone else has some wizardry to offer up.
This is the plot without setting the y location:
library(dplyr)
library(ggplot2)
df_x <- data.frame("Group" = c(rep("A",1000), rep("B",2) ),
"Value" = rnorm(1002))
df_x <- df_x %>%
group_by(Group) %>%
mutate(w_count = n())
ggplot(df_x, aes(x = Group, y = Value)) +
stat_summary(fun.data="mean_cl_boot", size = 1.2) +
geom_text(aes(label = w_count)) +
coord_flip() +
theme_classic()
and this is with my hack
ggplot(df_x, aes(x = Group, y = Value)) +
stat_summary(fun.data="mean_cl_boot", size = 1.2) +
geom_text(aes(y = 1, label = w_count)) +
coord_flip() +
theme_classic()
Create a df_text that has the grouped info for your labels. Then use annotate:
library(dplyr)
library(ggplot2)
set.seed(123)
df_x <- data.frame("Group" = c(rep("A",1000), rep("B",2) ),
"Value" = rnorm(1002))
df_text <- df_x %>%
group_by(Group) %>%
summarise(avg = mean(Value),
n = n()) %>%
ungroup()
yoff <- 0.0
xoff <- -0.1
ggplot(df_x, aes(x = Group, y = Value)) +
stat_summary(fun.data="mean_cl_boot", size = 1.2) +
annotate("text",
x = 1:2 + xoff,
y = df_text$avg + yoff,
label = df_text$n) +
coord_flip() +
theme_classic()
I found another way which is a little more robust for when the plot is dynamic in its ordering and filtering, and works well for faceting. More robust, because it uses stat_summary for the text.
library(dplyr)
library(ggplot2)
df_x <- data.frame("Group" = c(rep("A",1000), rep("B",2) ),
"Value" = rnorm(1002))
counts_df <- function(y) {
return( data.frame( y = 1, label = paste0('n=', length(y)) ) )
}
p <- ggplot(df_x, aes(x = Group, y = Value)) +
  stat_summary(fun.data = "mean_cl_boot", size = 1.2) +
  coord_flip() +
  theme_classic()
p + stat_summary(geom = "text", fun.data = counts_df)
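To see why this yields a single label per group, it may help to call the helper directly. stat_summary() passes each group's y values to fun.data and uses the columns of the returned data frame as aesthetics, so a one-row result gives one label. The calls below are only an illustration with made-up input.

```r
# Same helper as above: one row per call, regardless of the input length.
counts_df <- function(y) {
  data.frame(y = 1, label = paste0("n=", length(y)))
}

counts_df(rnorm(1000))   # one row, label "n=1000"
counts_df(c(0.3, -1.2))  # one row, label "n=2"
```

The fixed y = 1 decides where the label is drawn along the value axis; returning something like max(y) instead would place it relative to the data.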

Adding reference lines to a bar-plot with ggplot in R

This is a minimal example that shows the plots I am trying to make.
Data looks like this:
plot1 = data.frame(
Factor1 = as.factor(rep('A', 4)),
Factor2 = as.factor(rep(c('C', 'D'), 2)),
Factor3 = as.factor(c( rep('E', 2), rep('F', 2))),
Y = c(0.225490, 0.121958, 0.218182, 0.269789)
)
plot2 = data.frame(
Factor1 = as.factor(rep('B', 4)),
Factor2 = as.factor(rep(c('C', 'D'), 2)),
Factor3 = as.factor(c( rep('E', 2), rep('F', 2))),
Y = c(-0.058585, -0.031686, 0.013141, 0.016249)
)
While the basic code for plotting looks like this:
require(ggplot2)
require(grid)
p1 <- ggplot(data=plot1, aes(x=Factor2, y=Y, fill=factor(Factor3))) +
ggtitle('Type: A') +
coord_cartesian(ylim = c(-0.10, 0.30)) +
geom_bar(position=position_dodge(.9), width=0.5, stat='identity') +
scale_x_discrete(name='Regime',
labels=c('C', 'D')) +
scale_y_continuous('Activations') +
scale_fill_brewer(palette='Dark2', name='Background:',
breaks=c('E','F'),
labels=c('E','F')) +
theme(axis.text=element_text(size=11),
axis.title.x=element_text(size=13, vjust=-0.75),
axis.title.y=element_text(size=13, vjust=0.75),
legend.text=element_blank(),
legend.title=element_blank(),
legend.position='none',
plot.title=element_text(hjust=0.5))
p2 <- ggplot(data=plot2, aes(x=Factor2, y=Y, fill=factor(Factor3))) +
ggtitle('Type: B') +
coord_cartesian(ylim = c(-0.10, 0.30)) +
geom_bar(position=position_dodge(.9), width=0.5, stat='identity') +
scale_x_discrete(name='Regime',
labels=c('C', 'D')) +
scale_y_continuous('Activations') +
scale_fill_brewer(palette='Dark2', name='Background:',
breaks=c('E','F'),
labels=c('E','F')) +
theme(axis.text=element_text(size=11),
axis.title.x=element_text(size=13, vjust=-0.75),
axis.title.y=element_blank(),
legend.text=element_text(size=11),
legend.title=element_text(size=13),
plot.title=element_text(hjust=0.5))
pushViewport(viewport(
layout=grid.layout(1, 2, heights=unit(4, 'null'),
widths=unit(c(1,1.17), 'null'))))
print(p1, vp=viewport(layout.pos.row=1, layout.pos.col=1))
print(p2, vp=viewport(layout.pos.row=1, layout.pos.col=2))
And the figure looks like this:
However, I would need something like this:
Thick black lines are the reference values. They are constant and the Figure presents that "reference situation". However, in other plots that I need to produce bars will change but the reference values should remain the same to make the comparisons straightforward and easy. I know I should be using geom_segment() but those lines in my attempts to make this work are just missing the bars.
Any help/advice? Thanks!
I was able to do this using geom_errorbarh. For instance, with the second figure:
p1 +
geom_errorbarh(
aes(xmin = as.numeric(Factor2)-.2,xmax = as.numeric(Factor2)+.2), #+/-.2 for width
position = position_dodge(0.9), size = 2, height = 0
)
OUTPUT:
And, if I understand the other plots you describe, you can specify the reference data in those, eg data = plot1
If your references are not going to be changed, you can create a second dataset and merge it to the dataset you are going to plot.
Here, I first add plot1 and plot2. Then, I create a new dataset that will be the reference dataset.
library(dplyr)
new_df = rbind(plot1, plot2)
ref_plot = new_df
ref_plot <- ref_plot %>% rename(Ref_value = Y)
Now you have new_df, the dataset to be plotted, and ref_plot, which contains the reference value for each condition.
Instead of using grid to create two separate plots and merge them afterwards, I preferred facet_wrap, which puts all panels in the same figure. It is much more convenient and doesn't require writing the same thing twice.
As mentioned by @AHart a few minutes before me, you can use an error-bar geom to draw the reference values on the plot. The difference is that I prefer geom_errorbar to geom_errorbarh.
Here is for the plot:
library(ggplot2)
new_df %>% left_join(ref_plot) %>%
ggplot(aes(x = Factor2, y = Y, fill = Factor3))+
geom_bar(stat = "identity", position = position_dodge())+
geom_errorbar(aes(ymin = Ref_value-0.00001, ymax = Ref_value+0.0001, group = Factor3), position = position_dodge(.9),width = 0.2)+
facet_wrap(.~Factor1, labeller = labeller(Factor1 = c(A = "Type A", B = "Type B"))) +
scale_x_discrete(name='Regime',
labels=c('C', 'D')) +
scale_fill_brewer(palette='Dark2', name='Background:',
breaks=c('E','F'),
labels=c('E','F')) +
theme(axis.text=element_text(size=11),
axis.title.x=element_text(size=13, vjust=-0.75),
axis.title.y=element_blank(),
legend.text=element_text(size=11),
legend.title=element_text(size=13),
plot.title=element_text(hjust=0.5))
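Since the reference values are meant to stay fixed while the bars change, a small sketch of that case may be useful. It keeps the references in their own data frame and joins them to a second set of bar heights; the heights in bars_df are made up for illustration, and ref_df, bars_df, plot_df and p_ref are hypothetical names.

```r
library(ggplot2)

# Fixed reference values (taken from plot1 in the question).
ref_df <- data.frame(
  Factor2   = rep(c("C", "D"), 2),
  Factor3   = rep(c("E", "F"), each = 2),
  Ref_value = c(0.225490, 0.121958, 0.218182, 0.269789)
)

# Hypothetical bar heights for another plot; these numbers are made up.
bars_df <- data.frame(
  Factor2 = rep(c("C", "D"), 2),
  Factor3 = rep(c("E", "F"), each = 2),
  Y       = c(0.18, 0.15, 0.25, 0.20)
)

# Attach the matching reference value to every bar.
plot_df <- merge(bars_df, ref_df, by = c("Factor2", "Factor3"))

p_ref <- ggplot(plot_df, aes(x = Factor2, y = Y, fill = Factor3)) +
  geom_col(position = position_dodge(0.9), width = 0.5) +
  # ymin == ymax collapses the error bar into a horizontal line;
  # linewidth needs ggplot2 >= 3.4 (older versions use size instead).
  geom_errorbar(aes(ymin = Ref_value, ymax = Ref_value, group = Factor3),
                position = position_dodge(0.9), width = 0.5, linewidth = 1.5)
p_ref
```

Using the same dodge width and bar width in both layers is what keeps each reference line on top of its own bar.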
