I have a ggplot2 barchart for which I changed the axis ticks. However, the panel grid is adding additional lines that I do not want. How do I remove them?
My problems:
I only want the vertical grid lines that match the x-axis ticks.
The position dodge preserve is not working correctly due to having a group and a fill.
My code:
library(ggplot2)
library(scales)  # provides number_format()

ggplot(byyear, aes(x = year, y = count, group = venue, colour = venue, fill = type)) +
  geom_bar(stat = "identity", position = position_dodge(preserve = "single")) +
  # BORDER SO I CAN DISTINGUISH THEM
  scale_colour_manual(name = "Venue", values = c("#FFFFFF", "#FFFFFF")) +
  scale_y_continuous(labels = number_format(accuracy = 1)) +
  # MAKE ALL YEARS APPEAR
  scale_x_continuous(breaks = unique(byyear$year)) +
  theme(legend.position = "bottom",
        axis.text.x = element_text(angle = 90, hjust = 1))
The data is of the structure:
year,venue,type,count
2010,venue1,type1,163
2010,venue1,type2,18
2011,venue1,type1,16
...
The plot that I'm obtaining is the following (I removed the legend on the plot)
I have a plot with points on a polar coordinate system. Each point has an associated label, which should be shown around the plot at the given angle. This can be achieved either using axis.text or geom_text; I have used geom_text here. Unfortunately, the text labels overlap. Using position = position_jitter() apparently only allows jittering by height, but not by width (i.e., does not solve the issue). MWE:
df <- data.frame("angle" = runif(50, 0, 359),
"projection" = runif(50, 0, 1),
"labels" = paste0("label_", 1:50))
ggplot(data = df, aes(x = angle, y = projection, label = labels)) +
geom_point() +
coord_polar() +
theme_minimal() +
geom_text(aes(x=angle, y=1.1,
label=labels),
size = 3)
I would like to jitter the labels such that they do not overlap, but stay outside the plotting area. I have also tried to use the angle argument to conditionally angle the labels to make more space, but couldn't figure out the right formula to make the angles work.
Edit: Here is another way to create the plot, using scale_x_continuous to create the labels as axis.text.x. This does, however, again lead to overlapping labels.
ggplot(data = df, aes(x = angle, y = projection, label = labels)) +
geom_point() +
coord_polar() +
scale_x_continuous(limits = c(0, 360), expand = c(0, 0), breaks = df$angle, labels=df$labels) +
theme_minimal() +
theme(panel.grid = element_blank())
ggrepel will work well in this context.
library(ggplot2)
library(ggrepel)
df <- data.frame("angle" = runif(50, 0, 359),
"projection" = runif(50, 0, 1),
"labels" = paste0("label_", 1:50))
ggplot(data = df, aes(x = angle, y = projection, label = labels)) +
geom_point() +
coord_polar() +
theme_minimal() +
geom_text_repel(size = 3)
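On the rotation formula mentioned in the question, here is a minimal sketch. It assumes the default coord_polar() orientation (0 at the top, angles increasing clockwise) and an x scale fixed to 0-360, so that the angle column equals the on-screen angle. The helpers label_angle and label_hjust are hypothetical names, not part of ggplot2. Rotating the labels radially frees up space, although labels at nearly identical angles can still collide.

```r
library(ggplot2)

set.seed(42)
df <- data.frame(angle = runif(50, 0, 359),
                 projection = runif(50, 0, 1),
                 labels = paste0("label_", 1:50))

# Hypothetical helpers: theta is measured clockwise from the top (degrees),
# while geom_text's angle is measured counter-clockwise from horizontal.
# Labels on the left half are turned by 180 degrees so they stay readable.
label_angle <- function(theta) ifelse(theta > 180, 270 - theta, 90 - theta)
# Anchor the text at the end nearest the plot so it extends outward.
label_hjust <- function(theta) ifelse(theta > 180, 1, 0)

p_polar <- ggplot(df, aes(x = angle, y = projection)) +
  geom_point() +
  geom_text(aes(y = 1.1, label = labels,
                angle = label_angle(angle),
                hjust = label_hjust(angle)),
            size = 3) +
  scale_x_continuous(limits = c(0, 360), expand = c(0, 0)) +
  coord_polar() +
  theme_minimal()

p_polar
```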
This is my first question here so hope this makes sense and thank you for your time in advance!
I am trying to generate a scatterplot with the data points being the log2 expression values of genes from 2 treatments from an RNA-Seq data set. With this code I have generated the plot below:
ggplot(control, aes(x=log2_iFGFR1_uninduced, y=log2_iFGFR4_uninduced)) +
geom_point(shape = 21, color = "black", fill = "gray70") +
ggtitle("Uninduced iFGFR1 vs Uninduced iFGFR4 ") +
xlab("Uninduced iFGFR1") +
ylab("Uninduced iFGFR4") +
scale_y_continuous(breaks = seq(-15,15,by = 1)) +
scale_x_continuous(breaks = seq(-15,15,by = 1)) +
geom_abline(intercept = 1, slope = 1, color="blue", size = 1) +
geom_abline(intercept = 0, slope = 1, colour = "black", size = 1) +
geom_abline(intercept = -1, slope = 1, colour = "red", size = 1) +
theme_classic() +
theme(plot.title = element_text(hjust=0.5))
Current scatterplot:
However, I would like to change the background of the plot below the red line to a lighter red and above the blue line to a lighter blue, but still being able to see the data points in these regions. I have tried so far by using polygons in the code below.
pol1 <- data.frame(x = c(-14, 15, 15), y = c(-15, -15, 14))
pol2 <- data.frame(x = c(-15, -15, 14), y = c(-14, 15, 15))
ggplot(control, aes(x=log2_iFGFR1_uninduced, y=log2_iFGFR4_uninduced)) +
geom_point(shape = 21, color = "black", fill = "gray70") +
ggtitle("Uninduced iFGFR1 vs Uninduced iFGFR4 ") +
xlab("Uninduced iFGFR1") +
ylab("Uninduced iFGFR4") +
scale_y_continuous(breaks = seq(-15,15,by = 1)) +
scale_x_continuous(breaks = seq(-15,15,by = 1)) +
geom_polygon(data = pol1, aes(x = x, y = y), color ="pink1") +
geom_polygon(data = pol2, aes(x = x, y = y), color ="powderblue") +
geom_abline(intercept = 1, slope = 1, color="blue", size = 1) +
geom_abline(intercept = 0, slope = 1, colour = "black", size = 1) +
geom_abline(intercept = -1, slope = 1, colour = "red", size = 1) +
theme_classic() +
theme(plot.title = element_text(hjust=0.5))
New scatterplot:
However, these polygons hide my data points in this area and I don't know how to keep the polygon color but see the data points as well. I have also tried adding "fill = NA" to the geom_polygon code but this makes the area white and only keeps a colored border. Also, these polygons shift my axis limits so how do I change the axes to begin at -15 and end at 15 rather than having that extra unwanted length?
Any help would be massively appreciated as I have struggled with this for a while now and asked friends and colleagues who were unable to help.
Thanks,
Liv
Your question has two parts, so I'll answer each in turn using a dummy dataset:
df <- data.frame(x=rnorm(20,5,1), y=rnorm(20,5,1))
Stop geom_polygon from hiding geom_point
Stefan had commented with the answer to this one. Here's an illustration. Order of operations matters in ggplot. The plot you create is a result of each geom (drawing operation) performed in sequence. In your case, you have geom_polygon after geom_point, so it means that it will plot on top of geom_point. To have the points plotted on top of the polygons, just have geom_point happen after geom_polygon. Here's an illustrative example:
p <- ggplot(df, aes(x,y)) + theme_bw()
p + geom_point() + xlim(0,10) + ylim(0,10)
Now if we add a geom_rect after, it hides the points:
p + geom_point() +
geom_rect(ymin=0, ymax=5, xmin=0, xmax=5, fill='lightblue') +
xlim(0,10) + ylim(0,10)
The way to prevent that is to just reverse the order of geom_point and geom_rect. It works this way for all geoms.
p + geom_rect(ymin=0, ymax=5, xmin=0, xmax=5, fill='lightblue') +
geom_point() +
xlim(0,10) + ylim(0,10)
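Applied to the polygon case from the question, the same idea might look like the sketch below. The triangle coordinates in tri are made up for the dummy data, not taken from the original plot. Note that fill sets the interior colour of a polygon while colour only sets its outline, and alpha makes the fill translucent so points would stay visible even if the layer order were wrong.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(x = rnorm(20, 5, 1), y = rnorm(20, 5, 1))

# Hypothetical shaded region below the diagonal (coordinates are illustrative).
tri <- data.frame(x = c(0, 10, 10), y = c(0, 0, 10))

p_tri <- ggplot(df, aes(x, y)) +
  # Polygon first, so it is drawn underneath the points.
  geom_polygon(data = tri, fill = "pink1", alpha = 0.4) +
  geom_point(shape = 21, colour = "black", fill = "gray70") +
  theme_bw()

p_tri
```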
Removing whitespace between the axis and limits of the axis
The second part of your question asks how to remove the white space between the edges of your geom_polygon and the axes. Notice how I have been using xlim and ylim to set limits? They are shortcuts for scale_x_continuous(limits=...) and scale_y_continuous(limits=...). The scale_... functions also take an expand= argument, which controls how much padding is added beyond the limits before the axis is drawn. A two-component numeric vector is read as c(multiplicative, additive) padding and is applied to both ends of the axis, so expand=c(0,0) removes the padding entirely. To pad the lower and upper ends differently, you can use the expansion() helper, which accepts separate lower and upper values.
Here's how to remove that whitespace:
p + geom_rect(ymin=0, ymax=5, xmin=0, xmax=5, fill='lightblue') +
geom_point() +
scale_x_continuous(limits=c(0,10), expand=c(0,0)) +
scale_y_continuous(limits=c(0,10), expand=c(0,0))
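If you want no padding at the origin but a little room at the far end, a sketch using expansion() could look like this. It assumes ggplot2 3.3.0 or later, where expansion() is available; the padding amounts are arbitrary.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(x = rnorm(20, 5, 1), y = rnorm(20, 5, 1))

p_exp <- ggplot(df, aes(x, y)) +
  geom_point() +
  theme_bw() +
  # No padding on the left, 5% of the range on the right.
  scale_x_continuous(limits = c(0, 10), expand = expansion(mult = c(0, 0.05))) +
  # No padding at the bottom, 0.5 data units at the top.
  scale_y_continuous(limits = c(0, 10), expand = expansion(add = c(0, 0.5)))

p_exp
```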
I would like to be able to extend my boxplots with additional information. Here is a working example for ggplot2:
library(ggplot2)
ToothGrowth$dose <- as.factor(ToothGrowth$dose)
# Basic box plot
p <- ggplot(ToothGrowth, aes(x=dose, y=len)) +
geom_boxplot()
# Rotate the box plot
p + coord_flip()
I would like to add additional information from a separate data frame. For example:
extra <- data.frame(dose=factor(c(0.5,1,2)), label=c("Label1", "Label2", "Label3"), n=c("n=42","n=52","n=35"))
> extra
dose label n
1 0.5 Label1 n=42
2 1 Label2 n=52
3 2 Label3 n=35
I would like to create the following figure where the information to each dose (factor) is outside the plot and aligns with each of the dose levels (I made this in powerpoint as an example):
EDIT:
I would like to ask advice for an extension of the initial question.
What about this extension where I use fill to split up dose by the two groups?
ToothGrowth$dose <- as.factor(ToothGrowth$dose)
ToothGrowth$group <- head(rep(1:2, 100), dim(ToothGrowth)[1])
ToothGrowth$group <- factor(ToothGrowth$group)
p <- ggplot(ToothGrowth, aes(x=dose, y=len, fill=group)) +
geom_boxplot()
# Rotate the box plot
p + coord_flip()
extra <- data.frame(
dose=factor(rep(c(0.5,1,2), each=2)),
group=factor(rep(c(1:2), 3)),
label=c("Label1A", "Label1B", "Label2A", "Label2B", "Label3A", "Label3B"),
n=c("n=12","n=30","n=20", "n=32","n=15","n=20")
)
Is it possible to align data from the new data frame (extra, 6 rows) with each of the dose/group combinations?
We can use geom_text with clip = "off" inside coord_flip:
ggplot(ToothGrowth, aes(x=dose, y=len)) +
geom_boxplot() +
geom_text(
y = max(ToothGrowth$len) * 1.1,
data = extra,
aes(x = dose, label = sprintf("%s\n%s", label, n)),
hjust = 0) +
coord_flip(clip = "off") +
theme(plot.margin = unit(c(1, 5, 0.5, 0.5), "lines"))
Explanation: We place text outside of the plot area with geom_text and disable clipping with clip = "off" inside coord_flip. Lastly, we increase the plot margin to accommodate the additional labels. You can adjust the vertical y position in the margin (so the horizontal position in the plot because of the coordinate flip) by changing the factor in y = max(ToothGrowth$len) * 1.1.
In response to your edit, here is one possibility:
extra <- data.frame(
dose=factor(rep(c(0.5,1,2), each=2)),
group=factor(rep(c(1:2), 3)),
label=c("Label1A", "Label1B", "Label2A", "Label2B", "Label3A", "Label3B"),
n=c("n=12","n=30","n=20", "n=32","n=15","n=20")
)
library(tidyverse)
ToothGrowth %>%
mutate(
dose = as.factor(dose),
group = as.factor(rep(1:2, nrow(ToothGrowth) / 2))) %>%
ggplot(aes(x = dose, y = len, fill = group)) +
geom_boxplot(position = position_dodge(width = 1)) +
geom_text(
data = extra %>%
mutate(
dose = as.factor(dose),
group = as.factor(group),
ymax = max(ToothGrowth$len) * 1.1),
aes(x = dose, y = ymax, label = sprintf("%s\n%s", label, n)),
position = position_dodge(width = 1),
size = 3,
hjust = 0) +
coord_flip(clip = "off", ylim = c(0, max(ToothGrowth$len))) +
theme(
plot.margin = unit(c(1, 5, 0.5, 0.5), "lines"),
legend.position = "bottom")
A few comments:
We ensure that the labels line up with the dodged boxes by using position_dodge(width = 1) inside both geom_text and geom_boxplot.
It seems that position_dodge does not like a global y (outside of aes). So we include the y position for the labels in extra and use it inside aes. As a result, we need to explicitly limit the range of the y axis. We can do that inside coord_flip with ylim = c(0, max(ToothGrowth$len)).
I have a dataframe of multiple columns (let's say n) with different range and a vector of length n. I want different x-axis for each variable to be shown below each box plot. I tried facet_grid and facet_wrap but it gives common x axis.
This is what I have tried:
library(ggplot2)
library(tidyr)  # provides gather()

d <- data.frame(matrix(rnorm(10000), ncol = 20))
point_var <- rnorm(20)
plot.data <- gather(d, variable, value)
plot.data$test_data <- rep(point_var, each = nrow(d))
ggplot(plot.data, aes(x = variable, y = value)) +
  geom_boxplot() +
  geom_point(aes(x = factor(variable), y = test_data), color = "red") +
  coord_flip() +
  xlab("Variables") +
  theme(legend.position = "none")
If you can live with having the text of the x axis above the plot, and with the order of the graphs being a bit messed up, this could work:
library(ggplot2)
library(grid)
p <- ggplot(plot.data, aes(x = 0, y = value)) +
  geom_boxplot() +
  geom_point(aes(x = 0, y = test_data), color = "red") +
  facet_wrap(~variable, scales = "free_y", switch = "y") +
  xlab("Variables") +
  # theme_bw() goes first so it does not reset the settings that follow
  theme_bw() +
  theme(legend.position = "none", axis.text.x = element_blank())
print(p, vp = viewport(angle = 270, width = unit(.75, "npc"), height = unit(.75, "npc")))
I'm actually just creating the graph without flipping coords, so that scales = 'free_y' works, switching the position of the strip labels, and then rotating the whole graph.
If you don't like the text above graph (which is understandable), I would consider creating a list of single plots and then putting them together with grid.arrange.
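The grid.arrange route could look roughly like the sketch below. It assumes the gridExtra package is installed, uses a smaller 4-column stand-in for the data from the question, and make_plot is a hypothetical helper name. Each variable gets its own plot, so each one also gets its own value axis.

```r
library(ggplot2)
library(gridExtra)

set.seed(1)
d <- data.frame(matrix(rnorm(2000), ncol = 4))  # smaller stand-in for the question's data
point_var <- rnorm(ncol(d))

# Hypothetical helper: one horizontal boxplot per variable, with the
# reference value for that variable drawn as a red point.
make_plot <- function(name, ref) {
  ggplot(data.frame(value = d[[name]]), aes(x = name, y = value)) +
    geom_boxplot() +
    geom_point(data = data.frame(value = ref), colour = "red") +
    coord_flip() +
    labs(x = NULL, y = NULL) +
    theme_bw()
}

plots <- unname(Map(make_plot, names(d), point_var))

# Stack the plots in a single column.
grid.arrange(grobs = plots, ncol = 1)
```

For the full 20-variable data frame you would probably want ncol = 2 or more and a taller output device.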
HTH,
Lorenzo