R: Adding Space between X-axis Tick Marks in Manhattan Plot?

I'm currently using the qqman R package to create Manhattan plot:
library(qqman)
manhattan(gwasResults, cex.axis = 0.5)
But I want to change all of the colors in each chromosome to black like this:
manhattan(gwasResults, col = c("black", "black"), cex.axis = 0.5)
I would also like to add spacing between chromosomes so that you can tell which column of data points belongs to which chromosome. Is there a plot argument I can pass to manhattan() to do this (see, for example, the image below)?

An alternative would be to use ggplot2. In fact, the example graph you gave as the desired output seems to have been created with ggplot2.
Here are two solutions:
a) Using jittering - geom_jitter
library(qqman)
library(ggplot2)
ggplot(data = gwasResults,
aes(x = as.factor(CHR),
y = -log10(P))) +
geom_jitter(width = 0.2) + # adjusting width, impacts the spacing
labs(x = "CHR") +
# remove space between plot area and x axis
scale_y_continuous(expand = c(0, 0.1))
b) Using faceting - facet_grid
ggplot(data = gwasResults,
aes(x = BP,
y = -log10(P))) +
geom_point() +
# remove space between plot area and x axis
scale_y_continuous(expand = c(0, 0.1)) +
# facet by CHR
facet_grid(cols = vars(CHR),
space = "free_x",
scales = "free_x",
switch = "x") +
labs(x = "CHR") +
theme(
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
panel.grid = element_blank(),
panel.spacing = unit(0.1, "cm") # adjust spacing between facets
)
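If a single panel is preferred over facets, one possible sketch is to shift each chromosome's positions by a cumulative offset plus a fixed gap. The toy data frame, the `gap` value and the `pos` column below are illustrative assumptions and not part of qqman; only the column names CHR, BP and P mirror gwasResults.

```r
library(ggplot2)

# Hypothetical toy data standing in for gwasResults (columns CHR, BP, P)
set.seed(1)
toy <- data.frame(
  CHR = rep(1:3, times = c(50, 40, 30)),
  BP  = c(1:50, 1:40, 1:30),
  P   = runif(120)
)

gap <- 10  # assumed spacing between chromosomes, in BP units

# Offset of each chromosome = lengths of all previous chromosomes plus gaps
chr_len <- tapply(toy$BP, toy$CHR, max)
offset  <- c(0, cumsum(chr_len + gap))[seq_along(chr_len)]
names(offset) <- names(chr_len)
toy$pos <- toy$BP + unname(offset[as.character(toy$CHR)])

# One tick per chromosome, placed at the middle of its range
mid <- tapply(toy$pos, toy$CHR, function(z) mean(range(z)))

p_gap <- ggplot(toy, aes(pos, -log10(P))) +
  geom_point() +
  scale_x_continuous(breaks = as.numeric(mid), labels = names(mid)) +
  labs(x = "CHR")
```

Increasing `gap` widens the empty band between chromosomes, in the same way panel.spacing does in the faceted version.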

Related

plot_grid function removes axis breaks from ggbreak in plots

I'm struggling with a problem:
I created two volcano plots in ggplot2, but because each plot has one outlier point, I need to add a y-axis break for better visualization.
The problem arises when I want to plot both on the same page using plot_grid from cowplot, because it draws the original plots without the breaks that I set.
p<- c1 %>%
ggplot(aes(x = avg_log2FC,
y = -log10(p_val_adj),
fill = gene_type,
size = gene_type,
alpha = gene_type)) +
geom_point(shape = 21, # Specify shape and colour as fixed local parameters
colour = "black") +
geom_hline(yintercept = 0,
linetype = "dashed") +
scale_fill_manual(values = cols) +
scale_size_manual(values = sizes) +
scale_alpha_manual(values = alphas) +
scale_x_continuous(limits=c(-1.5,1.5), breaks=seq(-1.5,1.5,0.5)) +
scale_y_continuous(limits=c(0,110),breaks=seq(0,110,25))+
labs(title = "Gene expression",
x = "log2(fold change)",
y = "-log10(adjusted P-value)",
colour = "Expression \nchange") +
theme_bw() + # Select theme with a white background
theme(panel.border = element_rect(colour = "black", fill = NA, size= 0.5),
panel.grid.minor = element_blank(),
panel.grid.major = element_blank())
p1 <- p + scale_y_break(breaks = c(30, 100))
p1
p plot without breaks:
and p1 plot with breaks:
The same I did for the second plot. But this is the result using plot_grid(p1,p3, ncol = 2)
Can you help me understand whether I'm doing something wrong, or is it just a limitation of the package?
OP, it seems that ggbreak is not compatible with functions that arrange multiple plots, as indicated in the documentation for the package here. There does seem to be a workaround via either print() (I didn't get this to work) or aplot::plot_list(...), which did work for me. Here's an example using built-in datasets.
# setting up the plots
library(ggplot2)
library(ggbreak)
library(cowplot)
p1 <-
ggplot(mtcars, aes(x=mpg, disp)) + geom_point() +
scale_y_break(c(200, 220))
p2 <-
ggplot(iris, aes(x=Sepal.Length, y=Sepal.Width, color=Species)) +
geom_point() + scale_y_break(c(3.5, 3.7))
Plots p1 and p2 yield breaks in the y axis like you would expect, but plot_grid(p1,p2) results in the plots placed side-by-side without the y axis breaks.
The following does work to arrange the plots without disturbing the y axis breaks:
aplot::plot_list(p1,p2)

Left-adjust (hjust = 0) vertical x axis labels on facets with free scale

I have decided to rephrase this question. (Editing would have taken more time and in my opinion would also not have helped the OP.)
How can one left-adjust (hjust = 0, i.e., in text direction) over facets, when scales = 'free_x'?
I don't really think that left-adjustment of x-labels is a very necessary thing to do (long labels generally being difficult to read, and right-adjusting probably the better choice) - but I find the problem interesting enough.
I tried with empty padding to the maximum character length, but this doesn't result in the same length for all strings. Also, setting axis.text.x = element_text(margin = margin()) doesn't help. Needless to say, hjust = 0 does not help, because it is adjusting within each facet.
library(ggplot2)
diamonds$cut_label <- paste("Super Dee-Duper", as.character(diamonds$cut))
ggplot(data = diamonds, aes(cut_label, carat)) +
facet_grid(~ cut, scales = "free_x") +
theme(axis.text.x = element_text(angle = 90))
The red arrows and dashed line indicate how the labels should adjust. hjust = 0 or margins or empty padding do not result in adjustment of those labels over all facets.
Data modification from this famous question
"I tried with empty padding to the maximum character length, but this doesn't result in the same length for all strings."
This caught my attention. Actually, it would result in the same length for all strings if you padded the labels with spaces, made them all the same length, and ensured the font family was monospaced (non-proportionally spaced).
First, pad the labels with spaces such that all labels have the same length. I'm going to use the str_pad function from the stringr package.
library(ggplot2)
data("diamonds")
diamonds$cut_label <- paste("Super Dee-Duper", as.character(diamonds$cut))
library(stringr)
diamonds$cut_label <- str_pad(diamonds$cut_label, side="right",
width=max(nchar(diamonds$cut_label)), pad=" ")
Then, you may need to load a non-proportionally-spaced font using the extrafont package.
library(extrafont)
font_import(pattern='consola') # Or any other of your choice.
Then, run the ggplot command and specify the non-proportionally spaced font using the family argument.
ggplot(data = diamonds, aes(cut_label, carat)) +
facet_grid(~cut, scales = "free_x") +
theme(axis.text.x = element_text(angle = 90, family="Consolas"))
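As a quick check of the padding step, the sketch below uses base R's format() in place of str_pad (a swapped-in alternative, not what the answer uses) and confirms that every padded label has the same number of characters. The `labels` vector is a hand-written stand-in for the distinct values of cut_label.

```r
# Stand-in for the distinct cut_label values
labels <- paste("Super Dee-Duper",
                c("Fair", "Good", "Very Good", "Premium", "Ideal"))

# format() left-justifies character vectors and right-pads them with spaces
padded <- format(labels, width = max(nchar(labels)))

# All padded labels now share one character count
nchar(padded)
```

Equal character counts only translate into equal rendered widths when the font is monospaced, which is why the family argument matters in the plot above.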
One way, and possibly the most straightforward hack, would be to annotate outside the coordinates.
The disadvantage is that the parameters need manual adjustment (y coordinate and plot margin), and I don't see how to automate this.
library(ggplot2)
diamonds$cut_label <- paste("Super Dee-Duper", as.character(diamonds$cut))
ann_x <- data.frame(x = unique(diamonds$cut_label), y = -16, cut = unique(diamonds$cut))
ggplot(data = diamonds, aes(cut_label, carat)) +
facet_grid(~cut, scales = "free_x") +
geom_text(data = ann_x, aes(x, y, label = x), angle = 90, hjust = 0) +
theme(
axis.text.x = element_blank(),
plot.margin = margin(t = 0.1, r = 0.1, b = 2.2, l = 0.1, unit = "in")
) +
coord_cartesian(ylim = c(0, 14), clip = "off")
Created on 2020-03-14 by the reprex package (v0.3.0)
I'd approach this by making 2 plots, one of the plot area and one of the axis labels, then stick them together with a package like cowplot. You can use some theme settings to disguise the fact that the axis labels are actually made by a geom_text.
The first plot is fairly straightforward. For the second which becomes the axis labels, use dummy data with the same variables and adjust spacing how you want via text size and scale expansion. You'll probably also want to mess with the rel_heights argument in plot_grid to change the ratio of the two charts' heights.
library(ggplot2)
library(cowplot)
p1 <- ggplot(diamonds, aes(x = cut_label, y = carat)) +
facet_grid(cols = vars(cut), scales = "free_x") +
theme(axis.text.x = element_blank()) +
labs(x = NULL)
axis <- ggplot(dplyr::distinct(diamonds, cut_label, cut), aes(x = cut_label, y = 1)) +
geom_text(aes(label = cut_label), angle = 90, hjust = 0, size = 3.5) +
facet_grid(cols = vars(cut), scales = "free_x") +
scale_x_discrete(breaks = NULL) +
scale_y_continuous(expand = expansion(add = c(0.1, 1)), breaks = NULL) +
labs(y = NULL) +
theme(strip.text = element_blank(),
axis.text.x = element_blank(),
axis.ticks = element_blank(),
panel.background = element_blank())
plot_grid(p1, axis, ncol = 1, axis = "lr", align = "v")
We can edit the text grobs after generating the plot, using library(grid).
library(ggplot2)
library(grid) # provides textGrob(), gpar() and grid.draw()
g <- ggplot(data = diamonds, aes(cut_label, carat)) +
facet_grid(~cut, scales = "free_x") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
gt <- cowplot::as_gtable(g)
axis_grobs <- which(grepl("axis-b", gt$layout$name))
labs <- levels(factor(diamonds$cut_label))[order(levels(diamonds$cut))]
for (i in seq_along(axis_grobs)) {
gt$grobs[axis_grobs[i]][[1]] <-
textGrob(labs[i], y = unit(0, "npc"), just = "left", rot = 90, gp = gpar(fontsize = 9))
}
grid.draw(gt)
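Before overwriting grobs, it can help to look at the layout names of the gtable to see which entries are the bottom axes. The sketch below does this on a small mtcars plot chosen purely for illustration, and uses ggplotGrob() rather than cowplot::as_gtable() to avoid the extra dependency.

```r
library(ggplot2)

# Small illustrative plot with two facet columns
g_demo <- ggplot(mtcars, aes(factor(cyl), mpg)) +
  geom_point() +
  facet_grid(~am, scales = "free_x")

# Convert to a gtable and list the bottom-axis entries by name
gt_demo <- ggplotGrob(g_demo)
axis_names <- grep("^axis-b", gt_demo$layout$name, value = TRUE)
axis_names
```

With facet_grid there should be one bottom-axis entry per facet column, which is what the loop over axis_grobs above relies on.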

Manually change order of y axis items on complicated stacked bar chart in ggplot2

I've been stuck on an issue and can't find a solution. I've tried many suggestions on Stack Overflow and elsewhere about manually ordering a stacked bar chart, since that should be a pretty simple fix, but those suggestions don't work with the huge complicated mess of code I plucked from many places. My only issue is y-axis item ordering.
I'm making a series of stacked bar charts, and ggplot2 changes the ordering of the items on the y-axis depending on which dataframe I am trying to plot. I'm trying to make 39 of these plots and want them to all have the same ordering. I think ggplot2 only wants to plot them in ascending order of their numeric mean or something, but I'd like all of the bar charts to first display the group "Bird Advocates" and then "Cat Advocates." (This is also the order they appear in my data frame, but that ordering is lost at the coord_flip() point in plotting.)
I think that taking the data frame through so many changes is why I can't just add something simple at the end or use the reorder() function. Adding things into aes() also doesn't work, since the stacked bar chart I'm creating seems to depend on those items being exactly a certain way.
Here's one of my data frames where ggplot2 is ordering my y-axis items incorrectly, plotting "Cat Advocates" before "Bird Advocates":
Group,Strongly Opposed,Opposed,Slightly Opposed,Neutral,Slightly Support,Support,Strongly Support
Bird Advocates,0.005473026,0.010946052,0.012509773,0.058639562,0.071149335,0.31118061,0.530101642
Cat Advocates,0.04491726,0.07013396,0.03624901,0.23719464,0.09141056,0.23404255,0.28605201
And here's all the code that takes that and turns it into a plot:
library(ggplot2)
library(reshape2)
library(plotly)
#Importing data from a .csv file
data <- read.csv("data.csv", header=TRUE)
data$s.Strongly.Opposed <- 0-data$Strongly.Opposed-data$Opposed-data$Slightly.Opposed-.5*data$Neutral
data$s.Opposed <- 0-data$Opposed-data$Slightly.Opposed-.5*data$Neutral
data$s.Slightly.Opposed <- 0-data$Slightly.Opposed-.5*data$Neutral
data$s.Neutral <- 0-.5*data$Neutral
data$s.Slightly.Support <- 0+.5*data$Neutral
data$s.Support <- 0+data$Slightly.Support+.5*data$Neutral
data$s.Strongly.Support <- 0+data$Support+data$Slightly.Support+.5*data$Neutral
#to percents
data[,2:15]<-data[,2:15]*100
#melting
mdfr <- melt(data, id=c("Group"))
mdfr<-cbind(mdfr[1:14,],mdfr[15:28,3])
colnames(mdfr)<-c("Group","variable","value","start")
#remove dot in level names
mylevels<-c("Strongly Opposed","Opposed","Slightly Opposed","Neutral","Slightly Support","Support","Strongly Support")
mdfr$variable<-droplevels(mdfr$variable)
levels(mdfr$variable)<-mylevels
pal<-c("#bd7523", "#e9aa61", "#f6d1a7", "#999999", "#c8cbc0", "#65806d", "#334e3b")
ggplot(data=mdfr) +
geom_segment(aes(x = Group, y = start, xend = Group, yend = start+value, colour = variable,
text=paste("Group: ",Group,"<br>Percent: ",value,"%")), size = 5) +
geom_hline(yintercept = 0, color =c("#646464")) +
coord_flip() +
theme(legend.position="top") +
theme(legend.key.width=unit(0.5,"cm")) +
guides(col = guide_legend(ncol = 12)) + #has 7 real columns, using to adjust legend position
scale_color_manual("Response", labels = mylevels, values = pal, guide="legend") +
theme(legend.title = element_blank()) +
theme(axis.title.x = element_blank()) +
theme(axis.title.y = element_blank()) +
theme(axis.ticks = element_blank()) +
theme(axis.text.x = element_blank()) +
theme(legend.key = element_rect(fill = "white")) +
scale_y_continuous(breaks=seq(-100,100,100), limits=c(-100,100)) +
theme(panel.background = element_rect(fill = "#ffffff"),
panel.grid.major = element_line(colour = "#CBCBCB"))
The plot:
I think this works, you may need to play around with the axis limits/breaks:
library(dplyr)
mdfr <- mdfr %>%
mutate(group_n = as.integer(case_when(Group == "Bird Advocates" ~ 2,
Group == "Cat Advocates" ~ 1)))
ggplot(data=mdfr) +
geom_segment(aes(x = group_n, y = start, xend = group_n, yend = start + value, colour = variable,
text=paste("Group: ",Group,"<br>Percent: ",value,"%")), size = 5) +
scale_x_continuous(limits = c(0,3), breaks = c(1, 2), labels = c("Cat", "Bird")) +
geom_hline(yintercept = 0, color =c("#646464")) +
theme(legend.position="top") +
theme(legend.key.width=unit(0.5,"cm")) +
coord_flip() +
guides(col = guide_legend(ncol = 12)) + #has 7 real columns, using to adjust legend position
scale_color_manual("Response", labels = mylevels, values = pal, guide="legend") +
theme(legend.title = element_blank()) +
theme(axis.title.x = element_blank()) +
theme(axis.title.y = element_blank()) +
theme(axis.ticks = element_blank()) +
theme(axis.text.x = element_blank()) +
theme(legend.key = element_rect(fill = "white"))+
scale_y_continuous(breaks=seq(-100,100,100), limits=c(-100,100)) +
theme(panel.background = element_rect(fill = "#ffffff"),
panel.grid.major = element_line(colour = "#CBCBCB"))
produces this plot:
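You want to convert the 'Group' variable to a factor whose levels are in the order in which you want the bars to appear. Note that after coord_flip() the first level is drawn at the bottom of the axis, so reverse the level order if "Bird Advocates" should appear on top.
mdfr$Group <- factor(mdfr$Group, levels = c("Bird Advocates", "Cat Advocates"))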
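To see why the factor levels control the order, here is a minimal base R sketch. The `Group` vector is a made-up stand-in for mdfr$Group; the integer codes are the positions ggplot2 uses on a discrete axis, and with coord_flip() position 1 ends up at the bottom.

```r
# Made-up stand-in for mdfr$Group
Group <- c("Cat Advocates", "Bird Advocates", "Cat Advocates")

# Default: levels are sorted alphabetically, regardless of data order
f_default <- factor(Group)
levels(f_default)

# Explicit levels: "Cat Advocates" gets position 1 (bottom after coord_flip()),
# "Bird Advocates" gets position 2 (top)
f_manual <- factor(Group, levels = c("Cat Advocates", "Bird Advocates"))
as.integer(f_manual)
```

This also explains why the order in the data frame is ignored: a character column is converted using the alphabetical default.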

decrease size of dendrogram (or y-axis) ggplot

I have this code for a dendrogram. How can I decrease the size of dendrogram (or y-axis)?
I am using this code as an example. In my dataset the labels are long, so there is not enough space to show them. For that reason, I would like to reduce the space used by the y-axis, i.e., decrease the distance between 0 and 150. Also, when I save the figure as a TIFF, most of the figure is the dendrogram and I cannot read the labels clearly.
df <- USArrests # really bad idea to muck up internal datasets
labs <- paste("sta_",1:50,sep="") # new labels
rownames(df) <- labs # set new row names
library(ggplot2)
library(ggdendro)
hc <- hclust(dist(df), "ave") # hierarchical clustering
dendr <- dendro_data(hc, type="rectangle") # convert for ggplot
clust <- cutree(hc,k=2) # find 2 clusters
clust.df <- data.frame(label=names(clust), cluster=factor(clust))
# dendr[["labels"]] has the labels, merge with clust.df based on label column
dendr[["labels"]] <- merge(dendr[["labels"]],clust.df, by="label")
# plot the dendrogram; note use of color=cluster in geom_text(...)
ggplot() +
geom_segment(data=segment(dendr), aes(x=x, y=y, xend=xend, yend=yend)) +
geom_text(data=label(dendr),
aes(x, y, label=label, hjust=0, color=cluster),
size=3) +
coord_flip() +
scale_y_reverse(expand=c(0.2, 0)) +
theme(axis.line.y=element_blank(),
axis.ticks.y=element_blank(),
axis.text.y=element_blank(),
axis.title.y=element_blank(),
panel.background=element_rect(fill="white"),
panel.grid=element_blank())
How can I shrink the dendrogram so that it looks similar to the one in this heatmap?
(source: r-graph-gallery.com)
Thank you so much
For flexibility, I recommend putting the dendrogram labels on the x-axis itself, rather than text labels within the plot. Otherwise no matter what values you choose for expand in the y-axis, part of the labels could be cut off for some image sizes / dimensions.
Define colour palette for the dendrogram labels:
library(dplyr)
label.colour = label(dendr)$cluster %>%
factor(levels = levels(.),
labels = scales::hue_pal()(n_distinct(.))) %>%
as.character()
For the purpose of illustration, make some labels very long:
label.values <- forcats::fct_recode(
label(dendr)$label,
sta_45_abcdefghijklmnop = "sta_45",
sta_31_merrychristmas = "sta_31",
sta_6_9876543210 = "sta_6")
Plot:
p <- ggplot(segment(dendr)) +
geom_segment(aes(x=x, y=y, xend=xend, yend=yend)) +
coord_flip() +
scale_x_continuous(breaks = label(dendr)$x,
# I'm using label.values here because I made
# some long labels for illustration. you can
# simply use `labels = label(dendr)$label`
labels = label.values,
position = "top") +
scale_y_reverse(expand = c(0, 0)) +
theme_minimal() +
theme(axis.title = element_blank(),
axis.text.y = element_text(size = rel(0.9),
color = label.colour),
panel.grid = element_blank())
p
# or if you want a color legend for the clusters
p + geom_point(data = label(dendr),
aes(x = x, y = y, color = cluster), alpha = 0) +
scale_color_discrete(name = "Cluster",
guide = guide_legend(override.aes = list(alpha = 1))) +
theme(legend.position = "bottom")
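The breaks and labels passed to scale_x_continuous() can also be derived from the hclust object itself. The sketch below assumes, as is the usual dendrogram convention, that leaves sit at integer positions 1..n in the order given by hc$order; the names `hc_demo`, `axis_breaks` and `axis_labels` are illustrative, and it is worth checking the result against label(dendr) before relying on it.

```r
# Cluster the built-in dataset directly (no relabelling, for brevity)
hc_demo <- hclust(dist(USArrests), "average")

# Leaf positions along the axis and the label shown at each position
axis_breaks <- seq_along(hc_demo$order)
axis_labels <- hc_demo$labels[hc_demo$order]

head(data.frame(x = axis_breaks, label = axis_labels))
```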
You can do this by adding a size parameter to axis.text.y like so:
theme(axis.line.y=element_blank(),
axis.ticks.y=element_blank(),
axis.text.y=element_text(size=12),
axis.title.y=element_blank(),
panel.background=element_rect(fill="white"),
panel.grid=element_blank())

`ggplot2 - facet_grid`: Axes without ticks in interior facets

I would like to create facet_grid / facet_wrap plot with the x axis being repeated under each graph but with ticks present only on the lowest graph.
Here is an example of a plot with the x axis present only once using facet_grid
ggplot(mtcars, aes(y=mpg,x=cyl)) +
facet_grid(am~., scales="free") +
geom_point() +
theme_classic() +
theme(strip.background = element_rect(colour="white", fill="white"),
strip.text.y = element_blank())
Here is an example of a plot with the x axis present twice but with ticks both times using facet_wrap
ggplot(mtcars, aes(y=mpg, x=cyl)) +
facet_wrap(~am, ncol=1, scales="free") +
geom_point() +
theme_classic() +
theme(strip.background = element_rect(colour="white", fill="white"),
strip.text.x = element_blank())
I would like the same plot as the one just above but without the ticks on the x-axis of the upper graph. Or if you prefer, I would like the same plot as the first one but with an x-axis on the upper graph.
This is a very verbose solution, but I don't think you can get the plot you want using just the usual ggplot functions.
library(ggplot2)
library(grid)
Plot <- ggplot(mtcars, aes(y=mpg, x=cyl)) +
facet_wrap(~am, ncol=1, scales="free") +
geom_point() +
theme_classic() +
theme(strip.background = element_rect(colour="white", fill="white"),
strip.text.x = element_blank())
Switching off the top x-axis requires modifying the gtable object for the plot.
Plot.build <- ggplot_gtable(ggplot_build(Plot))
axis.pos <- grep("axis-b-1-1", Plot.build$layout$name)
num.ticks <- length(Plot.build$grobs[[axis.pos]]$children[2]$axis$grobs[[1]]$y)
This step removes the axis labels:
Plot.build$grobs[[axis.pos]]$children$axis$grobs[[2]]$children[[1]]$label <- rep("", num.ticks)
This step removes the tick marks:
Plot.build$grobs[[axis.pos]]$children[2]$axis$grobs[[1]]$y <- rep(unit(0, units="cm"), num.ticks)
Finally, the plot is generated using:
grid.draw(Plot.build)
The workaround I use to get just an axis line (no tick marks) is to use geom_hline() to fake an axis.
#make a dataframe with the y minimum for each facet
fake.axes <- data.frame(mpg = c(10, 15), #y minimum to use for axis location
am = c(0,1)) #facetting variable
#add an "axis" without ticks to upper graph using geom_hline()
ggplot(mtcars, aes(y=mpg,x=cyl)) +
facet_grid(am~., scales="free") +
geom_point() +
geom_hline(aes(yintercept = mpg), fake.axes, #dataframe with fake axes
linetype = c("solid", "blank")) + #line for top graph, blank for bottom graph
theme_classic() +
theme(strip.background = element_rect(colour="white", fill="white"),
strip.text.y = element_blank())
If you haven't used scales = "free" and all the axes are in the same location, this is even simpler: you can skip making a dataframe with yintercepts for each facet and simply add
geom_hline(yintercept = 10) (or whatever your minimum is) to your plot code to add an axis line on each facet.
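The hard-coded minima in fake.axes can also be computed from the data. This is only a sketch: rounding the per-facet minimum down with floor() is an assumption that happens to suit mtcars, and the name `fake.axes.auto` is illustrative. Other data may need a different rule to keep the line inside the panel.

```r
# Per-facet minimum of the y variable, one row per value of am
fake.axes.auto <- aggregate(mpg ~ am, data = mtcars, FUN = min)

# Round down so the line sits just below the lowest point
fake.axes.auto$mpg <- floor(fake.axes.auto$mpg)
fake.axes.auto
```

The resulting data frame has the same columns as fake.axes and can be passed to geom_hline() in its place.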
