I am trying to make a lollipop plot that shows a text 'condition' and its associated value. The issue I am having is that, because there is so much data, the labels overlap. Is there an easy fix for this?
This is my code (and my issue):
library(ggplot2)
df <- read.table(file = '24 hpi MP BP.tsv', sep = '\t', header = TRUE)
group <- df$Name
value <- df$Bgd.count
data <- data.frame(
x=group,
y=value
)
ggplot(data, aes(x=x, y=y)) +
geom_segment( aes(x=x, xend=x, y=0, yend=y), color="skyblue") +
geom_point( color="blue", size=4, alpha=0.6) +
theme_light() +
coord_flip() +
theme(
panel.grid.major.y = element_blank(),
panel.border = element_blank(),
axis.ticks.y = element_blank()
)
I am hoping to get a clear separation between the labels.
Your question does not provide a reproducible example, so here is a more general answer.
The problem is that you want to plot hundreds of discrete values. That is bound to yield a crowded graphic.
Your options:
Reduce the labels (don't label every axis position) and show only a few labels.
Focus on only a few important data points. I think this would be my preferred approach, as it also does your "story" more justice.
Group your values and show aggregate values such as means with error bars.
Make your graph appropriately large (change the height of the graphics device).
Use facets (but this will not really help with the crowding in all cases).
Shorten your labels.
Make the font smaller.
Last, but definitely not least, change your visualisation strategy.
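A few of the options above can be sketched as follows. This is only an illustration under assumptions: the data frame is a made-up stand-in for the question's file (200 hypothetical conditions with random counts), and the helper top_n_rows, the cut-off of 20 rows, and the "every 10th label" choice are arbitrary names and values picked for the example.

```r
library(ggplot2)

# Hypothetical stand-in data: 200 named conditions with random counts
set.seed(1)
data <- data.frame(
  x = sprintf("condition_%03d", 1:200),
  y = rpois(200, lambda = 50)
)

# Option: focus on a few data points (here the n largest values)
top_n_rows <- function(d, n = 20) {
  d[order(-d$y), ][seq_len(min(n, nrow(d))), ]
}
top <- top_n_rows(data, 20)

p_top <- ggplot(top, aes(x = reorder(x, y), y = y)) +
  geom_segment(aes(xend = reorder(x, y), y = 0, yend = y), color = "skyblue") +
  geom_point(color = "blue", size = 4, alpha = 0.6) +
  coord_flip() +
  theme_light()

# Option: keep all points, but label only every 10th condition
# and use a smaller font for the labels
every_10th <- data$x[seq(1, nrow(data), by = 10)]

p_all <- ggplot(data, aes(x = x, y = y)) +
  geom_segment(aes(xend = x, y = 0, yend = y), color = "skyblue") +
  geom_point(color = "blue", size = 2, alpha = 0.6) +
  scale_x_discrete(breaks = every_10th) +
  coord_flip() +
  theme_light() +
  theme(axis.text.y = element_text(size = 6))

# Option: make the saved graphic taller (file name is a placeholder)
# ggsave("lollipop.png", p_all, width = 6, height = 20)
```

Which option fits best depends on what the plot is meant to communicate; the top-n version is usually the easiest to read.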
The y-axis title appears too close to the axis text.
ggplot(mpg, aes(cty, hwy)) + geom_point()
I have tried changing the value of many parameters with theme() but none seems to help.
From ggplot2 2.0.0 you can use the margin = argument of element_text() to change the distance between the axis title and the numbers. Set the values of the margin on top, right, bottom, and left side of the element.
ggplot(mpg, aes(cty, hwy)) + geom_point()+
theme(axis.title.y = element_text(margin = margin(t = 0, r = 20, b = 0, l = 0)))
margin can also be used for other element_text elements (see ?theme), such as axis.text.x, axis.text.y and title.
Addition:
In order to set the margin for axis titles when the axis has a different position (e.g., with scale_x_...(position = "top")), you'll need a different theme setting, e.g. axis.title.x.top. See https://github.com/tidyverse/ggplot2/issues/4343.
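A minimal sketch of the top-axis case, assuming the x axis is moved with scale_x_continuous(position = "top"). The 20 pt bottom margin is an arbitrary value chosen for illustration.

```r
library(ggplot2)

# With the x axis on top, the title sits above the axis text,
# so the gap between them is controlled by the *bottom* margin
# of axis.title.x.top (not axis.title.x).
p <- ggplot(mpg, aes(cty, hwy)) +
  geom_point() +
  scale_x_continuous(position = "top") +
  theme(
    axis.title.x.top = element_text(margin = margin(t = 0, r = 0, b = 20, l = 0))
  )
```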
Based on this forum post: https://groups.google.com/forum/#!topic/ggplot2/mK9DR3dKIBU
Sounds like the easiest thing to do is to add a line break (\n) before your x axis, and after your y axis labels. Seems a lot easier (although dumber) than the solutions posted above.
ggplot(mpg, aes(cty, hwy)) +
geom_point() +
xlab("\nYour_x_Label") + ylab("Your_y_Label\n")
A solution that offers more fine-grained control than \n but is less cumbersome than adding margins is to use vjust in the theme function.
To add space, this often requires a positive value for vjust on the y-axis title or a negative value for vjust on the x-axis title, as in theme(axis.title.y = element_text(vjust = 2)). See a fully worked example below.
# load patchwork to show plots side-by-side
library(patchwork)
library(ggplot2)
# Plot A: just for comparison, moving titles *inward*
p1 <- ggplot(mpg, aes(cty, hwy)) +
geom_point() +
theme_gray() +
theme(
axis.title.y = element_text(vjust = -3),
axis.title.x = element_text(vjust = +3)
)
# Plot B: what we want, moving titles *outward*
p2 <- ggplot(mpg, aes(cty, hwy)) +
geom_point() +
theme_gray() +
theme(
axis.title.y = element_text(vjust = +3),
axis.title.x = element_text(vjust = -0.75)
)
# show plots side-by-side with patchwork package
p1 + p2 +
plot_annotation(tag_levels = "A")
For some reason the margin argument suggested by Didzis Elferts did not work for me. So, I used a different hack that is more flexible than adding an empty line but requires giving up the axis ticks.
myplot + theme(axis.ticks.x = element_blank(), axis.ticks.length.x = unit(3.25, "cm"))
I guess one can add the tick marks manually with geom_segment. Another possibility might be ggalt::annotation_ticks, but I didn't bother trying either (note the current version of ggalt on CRAN (0.4) does not support this function, the one on github (0.6) does).
Is there a way to change the shape of the points for missing data in R? I am plotting .csv files like this one in a lollipop style.
Name,chr,Pos,Reads...ME_016,Reads...ME_017,Reads...ME_018,Reads...ME_019
cg01389728,chr10,6620395,33.82,41.38,41.38,38.46
cg01389728,chr10,6620410,0,-,-,-
cg01389728,chr10,6620430,0,0,-,-
cg01389728,chr10,6620447,0,-,0,-
cg01389728,chr10,6620478,0,-,-,-
cg01389728,chr10,6620510,28.33,29.85,25.64,28.13
cg01389728,chr10,6620520,0,0,-,0
cg01389728,chr10,6620531,0,-,50,-
Using ggplot2, my graphs are created with this:
library(ggplot2)
library(reshape2)  # provides melt()
dataset <- read.table("testset", sep=",", na.strings="-", header=TRUE)
dataset <- subset(dataset, select=c(-Name, -chr))
dataset <- melt(dataset, id.vars="Pos")
dataset$variable <- gsub("\\.\\.\\.","_",dataset$variable)
xaxes <- unique(dataset$Pos)
dataset$Pos <- as.factor(dataset$Pos)
ggplot(dataset, aes(x=Pos, y=variable,fill=cut(value, breaks=10))) + geom_point(size=4, shape=21) + geom_line() + scale_fill_discrete(labels=c("0-10%","10-20%","20-30%","30-40%","40-50%","50-60%","60-70%","70-80%","80-90%","90-100%")) +
xlab("CpG Positions") +
ylab("Sample") +
labs(fill="Coverage in %") +
theme_bw() +
theme(axis.text.x = element_text(angle=90, hjust=1, vjust=0.5),plot.title = element_text(vjust=2),axis.title.x = element_text(vjust=-0.5),axis.title.y = element_text(vjust=1.5))
However, I want to set the shape of the missing points ("-") in the plot to an "x" (shape=4) and also show them in the legend.
I've tried approaches like:
scale_fill_manual(values=c(value, NA))
or:
scale_shape_manual(values=c(21,4))
By default, the "-" are also shown with shape 21 and grey colour. There must be a way to manipulate this? Writing a method like this might be the trick, but how to call it for the whole column?
formas <- function(x){
  if(is.na(x)) forma <- 4
  if(!is.na(x)) forma <- 21
  return(forma)
}
This comes pretty close, I think.
ggplot(dataset, aes(x=Pos, y=variable,
color=cut(value, breaks=10),
shape=ifelse(is.na(value),"Missing","Present"))) +
geom_point(size=4) +
geom_line() +
scale_shape_manual(name="",values=c(Missing=4,Present=19))+
scale_color_discrete(labels=c("0-10%","10-20%","20-30%","30-40%","40-50%","50-60%","60-70%","70-80%","80-90%","90-100%")) +
xlab("CpG Positions") +
ylab("Sample") +
labs(color="Coverage in %") +
theme_bw() +
theme(axis.text.x = element_text(angle=90, hjust=1, vjust=0.5),plot.title = element_text(vjust=2),axis.title.x = element_text(vjust=-0.5),axis.title.y = element_text(vjust=1.5))
Changes are:
used color instead of fill, with shape=19 for points with data
added shape aesthetic to ggplot(...) call.
removed shape=21 from geom_point(...) call.
added scale_shape_manual(...) to define the shapes for Missing and Present, and turn off the guide label.
I know you wanted filled points with a black outline (it does look better), but when I tried that with the added shape aesthetic, the fill legend does not display the colors correctly. Try it yourself.
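On the question of how to apply the missing/present test to a whole column: ifelse() and is.na() are vectorised, so no per-element function is needed. A small sketch, using a handful of values in the style of the question's sample (the vector itself and the name shape_codes are made up for illustration):

```r
# A few coverage values; the "-" entries from the file are read in as NA
value <- c(33.82, 0, NA, NA, 28.33, 50)

# Vectorised classification of the whole column in one call
status <- ifelse(is.na(value), "Missing", "Present")

# Named lookup, the same form that scale_shape_manual(values = ...) takes
shape_codes <- c(Missing = 4, Present = 19)
shapes <- unname(shape_codes[status])

print(data.frame(value, status, shapes))
```

Mapping status to the shape aesthetic and passing the named vector to scale_shape_manual() is what the plot code above does inline.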
Here is another approach that comes closer to producing the graph you specified (circular points with black outline and fill color determined by coverage).
fill.colors <- hcl(h=seq(15, 375, length=11), l=65, c=100)[1:10]
ggplot(dataset, aes(x=Pos, y=variable,
fill=cut(value, breaks=10),
shape=ifelse(is.na(value),"Missing","Present"))) +
geom_point(size=4) +
geom_line() +
scale_fill_manual(name="Coverage in %",
values=fill.colors,
labels=c("0-10%","10-20%","20-30%","30-40%","40-50%","50-60%","60-70%","70-80%","80-90%","90-100%"),
drop=FALSE) +
scale_shape_manual(name="",values=c(Missing=4,Present=21),limits=c("Missing"))+
xlab("CpG Positions") +
ylab("Sample") +
labs(color="Coverage in %") +
theme_bw() +
theme(axis.text.x = element_text(angle=90, hjust=1, vjust=0.5),
plot.title = element_text(vjust=2),
axis.title.x = element_text(vjust=-0.5),
axis.title.y = element_text(vjust=1.5))+
guides(fill=guide_legend(override.aes=list(colour=fill.colors),order=1))
The problem in the other answer with using point shape 21 and the fill aesthetic is that, while the fill colors are displayed correctly in the plot, they are not displayed correctly in the legend. One way around that is to force ggplot to set the legend fill colors using
guides(fill=guide_legend(override.aes=list(colour=fill.colors),order=1))
Unfortunately, to do that you have to specify the fill colors manually (so that the actual fill and the override fill are the same). This code does that using
fill.colors <- hcl(h=seq(15, 375, length=11), l=65, c=100)[1:10]
which creates a color palette that mimics the ggplot default. You could of course use your own color palette here.
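To see what that palette construction does, here is a small sketch in base R. The hues are spaced evenly around the colour wheel starting at 15 degrees; 11 hues are generated and the last one is dropped because 375 degrees is the same hue as 15 degrees. The optional comparison with scales::hue_pal() is an assumption that the scales package is installed; it is only printed for visual inspection.

```r
# Evenly spaced hues: 15, 51, ..., 339 (the 11th, 375, would repeat 15)
hues <- seq(15, 375, length.out = 11)[1:10]

# Same construction as in the answer, written with the full argument name
fill.colors <- hcl(h = hues, l = 65, c = 100)
print(fill.colors)

# Optional: print the default discrete palette from 'scales' for comparison
if (requireNamespace("scales", quietly = TRUE)) {
  print(scales::hue_pal()(10))
}
```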
While this does come closer to your original intent, I actually think the other answer provides a better data visualization. The black outlines around the points, while "attractive", make it much more difficult to distinguish between fill colors, especially with 10 possible colors (which is at the edge of discernibility anyway).
I can't see why this is not working:
fill.colors <- hcl(h=seq(15, 375, length=11), l=65, c=100)[1:10]
ggplot(dataset, aes(x=Pos, y=variable
,color=cut(value, breaks=c(-0.01,10,20,30,40,50,60,70,80,90,100))
,shape=ifelse(is.na(value),"Missing","Present"))) +
geom_point(size=4) +
scale_shape_manual(name="",values=c("Missing"=4,"Present"=19),limits=c("Missing"))+
scale_color_manual(name="Coverage in %",
values=ifelse(is.na(dataset$value),"grey",fill.colors),
labels=c("0-10%","10-20%","20-30%","30-40%","40-50%","50-60%","60-70%","70-80%","80-90%","90-100%"),drop=FALSE) +
theme_bw() +
theme(axis.text.x = element_text(angle=90, hjust=1, vjust=0.5),
plot.title = element_text(vjust=2),
axis.title.x = element_text(vjust=-0.5),
axis.title.y = element_text(vjust=1.5)) +
xlab("CpG Positions") +
ylab("Sample") +
labs(color="Coverage in %") +
guides(fill=guide_legend(override.aes=list(colour=fill.colors),order=1))
NA values are not shown anymore with an X, and instead of displaying them in "grey", the class 90-100% will be shown in grey. No error message is shown - what is the problem?
I would like to have a boxplot showing the same distribution underneath my histogram.
The code below almost works, but coord_flip() is being applied to all layers, instead of just the geom_boxplot layer.
plot1<-ggplot(newdatahistogram, aes_string(x=newdatahistogram[RawLocation])) +
xlab(GGVar) + ylab("Proportion of Instances") +
geom_histogram(aes(y=..density..), binwidth=1, colour="black", fill="white",origin=-0.5) +
scale_x_continuous(limits=c(-3,6), breaks=seq(0,5,by=1), expand=c(.01,0)) +
geom_boxplot(aes_string(x=-1, y=newdatahistogram[RawLocation])) + coord_flip()
How can I apply coord_flip() to a single layer?
Thank you!
I got it to work with a bit of a hack:
plot1 <- ggplot(newdatahistogram, aes_string(x=newdatahistogram[RawLocation], fill=(newdatahistogram[,"PQ"]))) +
xlab(GGVar) + ylab("Proportion of Observation") +
geom_histogram(aes(y=..density..), binwidth=1, colour="black", origin=-0.5) +
scale_x_continuous(limits=c(-1,6), breaks=seq(0,5,by=1), expand=c(.01,0)) +
scale_y_continuous(limits=c(-.2,1), breaks=seq(0,1,by=.2)) +
theme(plot.margin = unit(c(0,0,0,0), "cm"))
plot_box <- ggplot(newdatahistogram) +
geom_boxplot(aes_string(x=1, y=newdatahistogram[RawLocation])) +
scale_y_continuous(breaks=(0:5), labels=NULL, limits=c(-1,6), expand=c(.0,-.03)) +
scale_x_continuous(breaks=NULL) + xlab(NULL) + ylab(NULL) +
coord_flip() + theme_bw() +
theme(plot.margin = unit(c(0,0,.0,0), "cm"),
line=element_blank(),text=element_blank(),
axis.line = element_blank(),title=element_blank(), panel.border=element_blank())
PB = ggplotGrob(plot_box)
plot1 <- plot1 + annotation_custom(grob=PB, xmin=-1.01, xmax=5.95, ymin=-.3,ymax=0)
This saves the rotated boxplot as a grob object and inserts it into the plot under the histogram.
I needed to play with the expansion element a bit to get the scales to line up,
but it works!
Seriously though, I think ggplot should have a horizontal boxplot available without coord_flip()... I tried to edit the boxplot code, but it was way too difficult for me!
Tried to post image, but not enough reputation
You can't: coord_flip always acts on all layers. However, you do have two alternatives:
The solution here shows how to use grid.arrange() to add a marginal histogram. (The comments in the question also link to a nice base-R way to do the same thing.)
You could indicate density using a rug plot on one of the four sides of the plot with plot1 + geom_rug(sides='r')
ggplot(mpg, aes(x=class, y=cty)) +
geom_boxplot() + geom_rug(sides="r")
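For readers on ggplot2 3.3.0 or later, geoms such as geom_boxplot() can be drawn horizontally without coord_flip() by putting the continuous variable on x. The sketch below is one possible way to get a boxplot under a histogram with that feature: it stacks two plots with the patchwork package. The mpg data, the shared range x_range, and the 4:1 height ratio are assumptions made for the example, not taken from the question.

```r
library(ggplot2)
library(patchwork)

x_range <- c(5, 40)  # shared x range so the two panels line up

# Histogram on the density scale
p_hist <- ggplot(mpg, aes(x = cty)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 1,
                 colour = "black", fill = "white") +
  coord_cartesian(xlim = x_range) +
  ylab("Proportion of Instances")

# Horizontal boxplot: continuous variable on x, a single empty label on y
p_box <- ggplot(mpg, aes(x = cty, y = "")) +
  geom_boxplot() +
  coord_cartesian(xlim = x_range) +
  ylab(NULL) +
  theme(axis.ticks.y = element_blank())

# Stack the histogram above the boxplot
combined <- p_hist / p_box + plot_layout(heights = c(4, 1))
```

Because both panels use coord_cartesian() with the same limits, no data are dropped and the x axes cover the same range.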