I have a plot with points on a polar coordinate system. Each point has an associated label, which should be shown around the plot at the given angle. This can be achieved either using axis.text or geom_text; I have used geom_text here. Unfortunately, the text labels overlap. Using position = position_jitter() apparently only allows jittering by height, but not by width (i.e., does not solve the issue). MWE:
df <- data.frame("angle" = runif(50, 0, 359),
"projection" = runif(50, 0, 1),
"labels" = paste0("label_", 1:50))
ggplot(data = df, aes(x = angle, y = projection, label = labels)) +
geom_point() +
coord_polar() +
theme_minimal() +
geom_text(aes(x=angle, y=1.1,
label=labels),
size = 3)
I would like to jitter the labels such that they do not overlap, but stay outside the plotting area. I have also tried to use the angle argument to conditionally angle the labels to make more space, but couldn't figure out the right formula to make the angles work.
Edit: Here is another way to create the plot, using scale_x_continuous to create the labels as axis.text.x. This does, however, again lead to overlapping labels.
ggplot(data = df, aes(x = angle, y = projection, label = labels)) +
geom_point() +
coord_polar() +
scale_x_continuous(limits = c(0, 360), expand = c(0, 0), breaks = df$angle, labels=df$labels) +
theme_minimal() +
theme(panel.grid = element_blank())
ggrepel will work well in this context.
library(ggplot2)
library(ggrepel)
df <- data.frame("angle" = runif(50, 0, 359),
"projection" = runif(50, 0, 1),
"labels" = paste0("label_", 1:50))
ggplot(data = df, aes(x = angle, y = projection, label = labels)) +
geom_point() +
coord_polar() +
theme_minimal() +
geom_text_repel(size = 3)
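On the rotation formula the question asks about: the following is a minimal sketch, not part of the original answer. It assumes the default coord_polar() orientation (0 degrees at 12 o'clock, increasing clockwise) and an x scale fixed to 0-360 so that the data value equals the on-screen angle. The helper label_rotation() and the column names text_angle and hjust are hypothetical names introduced for illustration. Rotating the labels radially reduces overlap but does not guarantee that none remains.

```r
library(ggplot2)

# Hypothetical helper: rotation and justification for a label placed at
# `theta` degrees, measured clockwise from 12 o'clock.
label_rotation <- function(theta) {
  rot  <- 90 - theta    # text direction pointing radially outwards
  flip <- theta > 180   # labels on the left half would read upside down
  data.frame(
    text_angle = ifelse(flip, rot + 180, rot),  # turn flipped labels upright
    hjust      = ifelse(flip, 1, 0)             # anchor the end nearest the plot
  )
}

set.seed(1)
df <- data.frame(angle = runif(50, 0, 359),
                 projection = runif(50, 0, 1),
                 labels = paste0("label_", 1:50))
df <- cbind(df, label_rotation(df$angle))

p <- ggplot(df, aes(x = angle, y = projection)) +
  geom_point() +
  geom_text(aes(y = 1.1, label = labels, angle = text_angle, hjust = hjust),
            size = 3) +
  scale_x_continuous(limits = c(0, 360), expand = c(0, 0)) +
  coord_polar() +
  theme_minimal()
```

Printing p draws the plot; each label then reads outwards from the circle instead of lying horizontally across its neighbours.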
I would like to create a box/rectangle around a single level of a category and include the axis category text and the bar itself:
As you can see in the photo, the rectangle extends beyond the grid and into the plot area to encompass the axis text. I'm hoping for something customizable so I can draw rounded corners or not, change the color, and specify where it goes.
Here's some generic code I used to produce a plot:
ggplot(mtcars, aes(x=factor(cyl)))+
geom_bar(stat="count", width=0.7, fill="steelblue")+
theme_minimal()
Hopefully, this isn't answered somewhere already!
For the rectangle, use annotate() with "rect".
To let it extend over the x axis, blank out the x-axis text, line, and ticks,
then add a substitute axis with geom_text, setting y to 0 or -1. Adjust the values until it fits:
p <- ggplot(mtcars, aes(x=factor(cyl)))+
geom_bar(stat="count", width=0.7, fill="steelblue")+
theme_minimal()
p + annotate("rect", xmin = 0.5, xmax = 1.5, ymin = -1, ymax = 12,
alpha = 0, color= "green") +
theme(axis.text.x = element_blank(),
axis.line.x = element_blank(),
axis.ticks.x = element_blank()) +
geom_text(aes(y = -0.5, x = factor(cyl),
label = cyl)) +
labs(title="Rectangle over x axis!",
x ="cyl", y = "count")
That's what the ggforce package is great for. Here is a semi-programmatic approach to defining the x/y coordinates of your shape. If you intend to mark specific areas or data points, you might also want to look into ggforce::geom_mark_rect.
I have also left x as a numeric variable instead of converting it to a factor.
library(tidyverse)
library(ggforce)
cyl <- 4
n_cyl4 <- table(mtcars$cyl)[1]
df_rect <- data.frame(x = c(cyl - .5, rep(cyl + .5, 2), cyl - .5), y = c(rep(-2, 2), rep(n_cyl4 + .5, 2)))
ggplot(mtcars, aes(x = cyl)) +
geom_shape(data = df_rect, aes(x, y), fill = NA, color = "black", radius = .01) +
geom_bar(stat = "count", width = 0.7, fill = "steelblue") +
scale_x_continuous(breaks = seq(4, 8, 2)) +
coord_cartesian(ylim = c(0, NA), clip = "off") +
theme_minimal()
Created on 2021-08-03 by the reprex package (v2.0.0)
This is my first question here so hope this makes sense and thank you for your time in advance!
I am trying to generate a scatterplot with the data points being the log2 expression values of genes from 2 treatments from an RNA-Seq data set. With this code I have generated the plot below:
ggplot(control, aes(x=log2_iFGFR1_uninduced, y=log2_iFGFR4_uninduced)) +
geom_point(shape = 21, color = "black", fill = "gray70") +
ggtitle("Uninduced iFGFR1 vs Uninduced iFGFR4 ") +
xlab("Uninduced iFGFR1") +
ylab("Uninduced iFGFR4") +
scale_y_continuous(breaks = seq(-15,15,by = 1)) +
scale_x_continuous(breaks = seq(-15,15,by = 1)) +
geom_abline(intercept = 1, slope = 1, color="blue", size = 1) +
geom_abline(intercept = 0, slope = 1, colour = "black", size = 1) +
geom_abline(intercept = -1, slope = 1, colour = "red", size = 1) +
theme_classic() +
theme(plot.title = element_text(hjust=0.5))
Current scatterplot:
However, I would like to change the background of the plot below the red line to a lighter red and above the blue line to a lighter blue, but still being able to see the data points in these regions. I have tried so far by using polygons in the code below.
pol1 <- data.frame(x = c(-14, 15, 15), y = c(-15, -15, 14))
pol2 <- data.frame(x = c(-15, -15, 14), y = c(-14, 15, 15))
ggplot(control, aes(x=log2_iFGFR1_uninduced, y=log2_iFGFR4_uninduced)) +
geom_point(shape = 21, color = "black", fill = "gray70") +
ggtitle("Uninduced iFGFR1 vs Uninduced iFGFR4 ") +
xlab("Uninduced iFGFR1") +
ylab("Uninduced iFGFR4") +
scale_y_continuous(breaks = seq(-15,15,by = 1)) +
scale_x_continuous(breaks = seq(-15,15,by = 1)) +
geom_polygon(data = pol1, aes(x = x, y = y), color ="pink1") +
geom_polygon(data = pol2, aes(x = x, y = y), color ="powderblue") +
geom_abline(intercept = 1, slope = 1, color="blue", size = 1) +
geom_abline(intercept = 0, slope = 1, colour = "black", size = 1) +
geom_abline(intercept = -1, slope = 1, colour = "red", size = 1) +
theme_classic() +
theme(plot.title = element_text(hjust=0.5))
New scatterplot:
However, these polygons hide my data points in this area and I don't know how to keep the polygon color but see the data points as well. I have also tried adding "fill = NA" to the geom_polygon code but this makes the area white and only keeps a colored border. Also, these polygons shift my axis limits so how do I change the axes to begin at -15 and end at 15 rather than having that extra unwanted length?
Any help would be massively appreciated as I have struggled with this for a while now and asked friends and colleagues who were unable to help.
Thanks,
Liv
Your question has two parts, so I'll answer each in turn using a dummy dataset:
df <- data.frame(x=rnorm(20,5,1), y=rnorm(20,5,1))
Stop geom_polygon from hiding geom_point
Stefan had commented with the answer to this one. Here's an illustration. Order of operations matters in ggplot. The plot you create is a result of each geom (drawing operation) performed in sequence. In your case, you have geom_polygon after geom_point, so it means that it will plot on top of geom_point. To have the points plotted on top of the polygons, just have geom_point happen after geom_polygon. Here's an illustrative example:
p <- ggplot(df, aes(x,y)) + theme_bw()
p + geom_point() + xlim(0,10) + ylim(0,10)
Now if we add a geom_rect after, it hides the points:
p + geom_point() +
geom_rect(ymin=0, ymax=5, xmin=0, xmax=5, fill='lightblue') +
xlim(0,10) + ylim(0,10)
The way to prevent that is to just reverse the order of geom_point and geom_rect. It works this way for all geoms.
p + geom_rect(ymin=0, ymax=5, xmin=0, xmax=5, fill='lightblue') +
geom_point() +
xlim(0,10) + ylim(0,10)
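Applied to the triangles from the question, the same ordering idea might look like the sketch below. It uses made-up data (the original control data frame is not available), and the names below_red and above_blue are hypothetical. Note that the question's code set color=, which only controls the polygon outline; the shaded area is controlled by fill=, and alpha= makes it lighter.

```r
library(ggplot2)

set.seed(1)
# Dummy stand-in for the expression data (assumption: values within -15..15)
df <- data.frame(x = rnorm(200, 0, 4), y = rnorm(200, 0, 4))

# Triangle corners copied from the question
below_red  <- data.frame(x = c(-14, 15, 15),  y = c(-15, -15, 14))
above_blue <- data.frame(x = c(-15, -15, 14), y = c(-14, 15, 15))

p <- ggplot(df, aes(x, y)) +
  # Background shading first, so every later layer is drawn on top of it
  geom_polygon(data = below_red,  fill = "pink1",      alpha = 0.5) +
  geom_polygon(data = above_blue, fill = "powderblue", alpha = 0.5) +
  geom_abline(intercept = 1,  slope = 1, colour = "blue") +
  geom_abline(intercept = 0,  slope = 1, colour = "black") +
  geom_abline(intercept = -1, slope = 1, colour = "red") +
  # Points last, so they stay visible over the shaded regions
  geom_point(shape = 21, colour = "black", fill = "gray70") +
  # expand = FALSE keeps the axes at exactly -15 and 15
  coord_cartesian(xlim = c(-15, 15), ylim = c(-15, 15), expand = FALSE) +
  theme_classic()
```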
Removing whitespace between the axis and limits of the axis
The second part of your question asks how to remove the white space between the edges of your geom_polygon and the axes. Notice how I have been using xlim and ylim to set limits? They are shortcuts for scale_x_continuous(limits=...) and scale_y_continuous(limits=...); however, we can use the argument expand= within the scale_... functions to set how far to "expand" the plot beyond the limits before reaching the axis. A two-component numeric vector is read as c(mult, add): a multiplicative and an additive padding constant, both applied to the lower and the upper end of the axis. Setting both to 0 removes the padding entirely. (To pad the two ends differently, ggplot2 provides the expansion() helper.)
Here's how to remove that whitespace:
p + geom_rect(ymin=0, ymax=5, xmin=0, xmax=5, fill='lightblue') +
geom_point() +
scale_x_continuous(limits=c(0,10), expand=c(0,0)) +
scale_y_continuous(limits=c(0,10), expand=c(0,0))
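If only one end of an axis should lose its padding, the expansion() helper in ggplot2 (version 3.3.0 or later) accepts separate lower and upper values. The sketch below, using the same kind of dummy data, removes the padding at the bottom of the y axis but keeps 5% at the top; the specific values are only illustrative.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(x = rnorm(20, 5, 1), y = rnorm(20, 5, 1))

# expansion(mult = c(lower, upper)) pads each end of the axis separately
y_pad <- expansion(mult = c(0, 0.05))

p <- ggplot(df, aes(x, y)) +
  geom_point() +
  scale_x_continuous(limits = c(0, 10), expand = expansion(mult = 0)) +
  scale_y_continuous(limits = c(0, 10), expand = y_pad) +
  theme_bw()
```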
I would like to be able to extend my boxplots with additional information. Here is a working example for ggplot2:
library(ggplot2)
ToothGrowth$dose <- as.factor(ToothGrowth$dose)
# Basic box plot
p <- ggplot(ToothGrowth, aes(x=dose, y=len)) +
geom_boxplot()
# Rotate the box plot
p + coord_flip()
I would like to add additional information from a separate data frame. For example:
extra <- data.frame(dose=factor(c(0.5,1,2)), label=c("Label1", "Label2", "Label3"), n=c("n=42","n=52","n=35"))
> extra
  dose  label    n
1  0.5 Label1 n=42
2    1 Label2 n=52
3    2 Label3 n=35
I would like to create the following figure, where the information for each dose (factor) is outside the plot and aligns with each of the dose levels (I made this in PowerPoint as an example):
EDIT:
I would like to ask advice for an extension of the initial question.
What about this extension where I use fill to split up dose by the two groups?
ToothGrowth$dose <- as.factor(ToothGrowth$dose)
ToothGrowth$group <- head(rep(1:2, 100), dim(ToothGrowth)[1])
ToothGrowth$group <- factor(ToothGrowth$group)
p <- ggplot(ToothGrowth, aes(x=dose, y=len, fill=group)) +
geom_boxplot()
# Rotate the box plot
p + coord_flip()
extra <- data.frame(
dose=factor(rep(c(0.5,1,2), each=2)),
group=factor(rep(c(1:2), 3)),
label=c("Label1A", "Label1B", "Label2A", "Label2B", "Label3A", "Label3B"),
n=c("n=12","n=30","n=20", "n=32","n=15","n=20")
)
Is it possible to align data from the new data frame (extra, 6 rows) with each of the dose/group combinations?
We can use geom_text with clip = "off" inside coord_flip:
ggplot(ToothGrowth, aes(x=dose, y=len)) +
geom_boxplot() +
geom_text(
y = max(ToothGrowth$len) * 1.1,
data = extra,
aes(x = dose, label = sprintf("%s\n%s", label, n)),
hjust = 0) +
coord_flip(clip = "off") +
theme(plot.margin = unit(c(1, 5, 0.5, 0.5), "lines"))
Explanation: We place text outside of the plot area with geom_text and disable clipping with clip = "off" inside coord_flip. Lastly, we increase the plot margin to accommodate the additional labels. You can move the labels further into or out of the margin by changing the factor in y = max(ToothGrowth$len) * 1.1; because of the coordinate flip, this y value controls the horizontal position in the plot.
In response to your edit, here is a possibility
extra <- data.frame(
dose=factor(rep(c(0.5,1,2), each=2)),
group=factor(rep(c(1:2), 3)),
label=c("Label1A", "Label1B", "Label2A", "Label2B", "Label3A", "Label3B"),
n=c("n=12","n=30","n=20", "n=32","n=15","n=20")
)
library(tidyverse)
ToothGrowth %>%
mutate(
dose = as.factor(dose),
group = as.factor(rep(1:2, nrow(ToothGrowth) / 2))) %>%
ggplot(aes(x = dose, y = len, fill = group)) +
geom_boxplot(position = position_dodge(width = 1)) +
geom_text(
data = extra %>%
mutate(
dose = as.factor(dose),
group = as.factor(group),
ymax = max(ToothGrowth$len) * 1.1),
aes(x = dose, y = ymax, label = sprintf("%s\n%s", label, n)),
position = position_dodge(width = 1),
size = 3,
hjust = 0) +
coord_flip(clip = "off", ylim = c(0, max(ToothGrowth$len))) +
theme(
plot.margin = unit(c(1, 5, 0.5, 0.5), "lines"),
legend.position = "bottom")
A few comments:
We ensure that the labels line up with the dodged boxes by using position_dodge(width = 1) inside both geom_text and geom_boxplot.
It seems that position_dodge does not like a global y (outside of aes). So we include the y position for the labels in extra and use it inside aes. As a result, we need to explicitly limit the range of the y axis. We can do that inside coord_flip with ylim = c(0, max(ToothGrowth$len)).
My goal is to make a simple column chart in ggplot2 that looks like the following chart (made in Excel):
What I'm finding is that, with example data such as this (where one percentage value is very close to 100%), my options for plotting this data in ggplot2 leave something to be desired. In particular, I haven't found a way to make the following two simple things happen together:
1) Make the y-axis line end at 100%
and
2) Make the percentage labels over each bar visible
To address this issue, I've tried experimenting with different arguments to scale_y_continuous() but haven't found a way to meet both of the goals above at the same time. You can see this in the example plots and code below.
My question is: how do I expand the y scale so that my percentage labels over each data point are visible, but the y-axis line ends at 100%?
library(dplyr)
library(ggplot2)
library(scales)
example_df <- tibble(Label = c("A", "B"),  # tibble() replaces the deprecated data_frame()
Percent = c(0.5, 0.99))
example_plot <- example_df %>%
ggplot(aes(x = Label, y = Percent)) +
geom_bar(stat = "identity",
fill = "dodgerblue4", width = .6) +
geom_text(aes(label = percent(Percent)),
size = 3, vjust = -0.5) +
scale_x_discrete(NULL, expand = c(0, .5)) +
theme_classic()
Plot with desired y-axis line, but non-visible label over bar
Here is what happens when I set the limit on scale_y_continuous() to c(0,1):
example_plot +
scale_y_continuous(NULL, limits = c(0, 1.0), breaks = seq(0, 1, .2),
labels = function(x) scales::percent(x),
expand = c(0, 0)) +
labs(title = "Y axis line looks perfect, but the label over the bar is off")
Plot with y-axis line too long, but visible label over bar
And here is what happens when I set the limit on scale_y_continuous() to c(0,1.05):
example_plot +
scale_y_continuous(NULL, limits = c(0, 1.05), breaks = seq(0, 1, .2),
labels = function(x) scales::percent(x),
expand = c(0, 0)) +
labs(title = "Y axis line is too long, but the label over the bar is visible")
You could remove the regular axis line and then use geom_segment to create a new one:
example_df %>%
ggplot(aes(x = Label, y = Percent)) +
geom_bar(stat = "identity", fill = "dodgerblue4", width = .6) +
geom_text(aes(label = percent(Percent)), size = 3, vjust = -0.5) +
scale_x_discrete("", expand = c(0, .5)) +
scale_y_continuous("", breaks = seq(0, 1, .2), labels = percent, limits=c(0,1.05),
expand=c(0,0)) +
theme_classic() +
theme(axis.line.y=element_blank()) +
geom_segment(x=.5025, xend=0.5025, y=0, yend=1.002)
To respond to your comment: Even when it's outside the plot area, the 99% label is still being drawn, but it's "clipped", meaning that plot elements outside the plot area are masked. So, another option, still hacky, but less hacky than my original answer, is to turn off clipping so that the label appears:
library(grid)
p = example_df %>%
ggplot(aes(x = Label, y = Percent)) +
geom_bar(stat = "identity", fill = "dodgerblue4", width = .6) +
geom_text(aes(label = percent(Percent)), size = 3, vjust = -0.5) +
scale_x_discrete("", expand = c(0, .5)) +
scale_y_continuous("", breaks = seq(0, 1, .2), labels = percent, limits=c(0,1),
expand=c(0,0)) +
theme_classic() +
theme(plot.margin=unit(c(10,0,0,0),'pt'))
# Turn off clipping
pg <- ggplot_gtable(ggplot_build(p))
pg$layout$clip[pg$layout$name=="panel"] <- "off"
grid.draw(pg)
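In recent ggplot2 versions, clipping can also be switched off directly through the clip argument of the coord functions, as in coord_flip(clip = "off"), which avoids editing the gtable by hand. A minimal sketch of the same plot under that assumption, using geom_col and a plain data.frame for brevity:

```r
library(ggplot2)
library(scales)

example_df <- data.frame(Label = c("A", "B"), Percent = c(0.5, 0.99))

p <- ggplot(example_df, aes(x = Label, y = Percent)) +
  geom_col(fill = "dodgerblue4", width = .6) +
  geom_text(aes(label = percent(Percent)), size = 3, vjust = -0.5) +
  scale_x_discrete(NULL, expand = c(0, .5)) +
  scale_y_continuous(NULL, breaks = seq(0, 1, .2), labels = percent,
                     limits = c(0, 1), expand = c(0, 0)) +
  # Let the 99% label draw above the panel instead of being clipped
  coord_cartesian(clip = "off") +
  theme_classic() +
  # Extra top margin so the label has room outside the panel
  theme(plot.margin = margin(t = 10))
```

The y-axis line still ends at 100% because the limits are unchanged; only the masking of elements outside the panel is removed.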
I'm trying to create a scatterplot where the points are jittered (geom_jitter), but I also want to create a black outline around each point. Currently I'm doing it by adding 2 geom_jitters, one for the fill and one for the outline:
beta <- paste("beta == ", "0.15")
ggplot(aes(x=xVar, y = yVar), data = data) +
geom_jitter(size=3, alpha=0.6, colour=my.cols[2]) +
theme_bw() +
geom_abline(intercept = 0.0, slope = 0.145950, size=1) +
geom_vline(xintercept = 0, linetype = "dashed") +
annotate("text", x = 2.5, y = 0.2, label=beta, parse=TRUE, size=5)+
xlim(-1.5,4) +
ylim(-2,2)+
geom_jitter(shape = 1,size = 3,colour = "black")
However, that results in something like this:
Because jitter randomly offsets the data, the 2 geom_jitters are not in line with each other. How do I ensure the outlines are in the same place as the fill points?
I've seen threads about this (e.g. Is it possible to jitter two ggplot geoms in the same way?), but they're pretty old and I'm not sure if anything new has been added to ggplot that would solve this issue.
The code above works if, instead of using geom_jitter, I use the regular geom_point, but I have too many overlapping points for that to be useful
EDIT:
The solution in the posted answer works. However, it doesn't quite cooperate for some of my other graphs where I'm binning by some other variable and using that to plot different colours:
ggplot(aes(x=xVar, y = yVar, color=group), data = data) +
geom_jitter(size=3, alpha=0.6, shape=21, fill="skyblue") +
theme_bw() +
geom_vline(xintercept = 0, linetype = "dashed") +
scale_colour_brewer(name = "Title", direction = -1, palette = "Set1") +
xlim(-1.5,4) +
ylim(-2,2)
My group variable has 3 levels, and I want to colour each group level by a different colour in the brewer Set1 palette. The current solution just colours everything skyblue. What should I fill by to ensure I'm using the correct colour palette?
You don't actually have to use two layers; you can just use the fill aesthetic of a plotting character with a hole in it:
# some random data
set.seed(47)
df <- data.frame(x = rnorm(100), y = runif(100))
ggplot(aes(x = x, y = y), data = df) + geom_jitter(shape = 21, fill = 'skyblue')
The colour, size, and stroke aesthetics let you customize the exact look.
Edit:
For grouped data, set the fill aesthetic to the grouping variable, and use scale_fill_* functions to set color scales:
# more random data
set.seed(47)
df <- data.frame(x = runif(100), y = rnorm(100), group = sample(letters[1:3], 100, replace = TRUE))
ggplot(aes(x=x, y = y, fill=group), data = df) +
geom_jitter(size=3, alpha=0.6, shape=21) +
theme_bw() +
geom_vline(xintercept = 0, linetype = "dashed") +
scale_fill_brewer(name = "Title", direction = -1, palette = "Set1")
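If two separate layers really are needed, one way to keep them aligned is to give both the same position_jitter() object with a fixed seed, so that both layers receive identical offsets. This is a sketch rather than part of the original answer; the width, height, and seed values are arbitrary, and jit is a name chosen for illustration.

```r
library(ggplot2)

set.seed(47)
df <- data.frame(x = rnorm(100), y = runif(100))

# One shared jitter specification; the seed makes the offsets reproducible
jit <- position_jitter(width = 0.2, height = 0.2, seed = 123)

p <- ggplot(df, aes(x = x, y = y)) +
  geom_point(position = jit, size = 3, alpha = 0.6, colour = "skyblue") +
  geom_point(position = jit, shape = 1, size = 3, colour = "black") +
  theme_bw()
```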