How to reorder the results on Plot_composition Microbiome - r

I have a composition plot showing several phyla plus an Other category, produced like this:
pseq <- Merge_Final_pourcent %>% aggregate_taxa(level = "Phylum")
ps1.com.fam <- microbiome::aggregate_top_taxa(pseq, "Phylum", top = 11)
ps1.com.fam.rel <- microbiome::transform(ps1.com.fam, "compositional")
svg(file="phylum.svg")
plot_composition(ps1.com.fam.rel) + theme(legend.position = "bottom") +
scale_fill_brewer("Phylum", palette = "Paired") + theme_bw() +
theme(axis.text.x = element_text(angle = 90)) +
ggtitle("Relative abundance") + theme(legend.title = element_text(size = 18))
dev.off()
I would like to move the Other category down to the end of the legend.
So I tried renaming only Other to Z_Other inside my phyloseq object, but the category then disappears from both the phyloseq object and the plot:
essai<-as.data.frame(tax_table(ps1.com.fam.rel))
essai$unique <- sub("Other", "Z_Other", essai$unique)
essai$Phylum <- sub("Other", "Z_Other", essai$Phylum)
rownames(essai) <- sub("Other", "Z_Other", rownames(essai))
tax_table(ps1.com.fam.rel) <- as.matrix(essai)
essai<-as.data.frame(otu_table(ps1.com.fam.rel))
rownames(otu_table(ps1.com.fam.rel)) <- sub("Other", "Z_Other", rownames(otu_table(ps1.com.fam.rel)))
otu_table(ps1.com.fam.rel) <- as.matrix(essai)
svg(file="0_1phylum_BMP.svg")
plot_composition(ps1.com.fam.rel) + theme(legend.position = "bottom") +
scale_fill_brewer("Phylum", palette = "Paired") + theme_bw() +
theme(axis.text.x = element_text(angle = 90)) +
ggtitle("Relative abundance") + theme(legend.title = element_text(size = 18))
dev.off()

My suggestion would be to try to reorder the levels using factor. Perhaps this might help.
essainoo <- subset(essai, essai$Phylum!="Other")
neworder <- c(unique(essainoo$Phylum),"Other")
essai$Phylum <- factor(essai$Phylum, levels=neworder)
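To make the factor idea concrete, here is a minimal base R sketch on a hypothetical vector of phylum names (phylum is an assumption standing in for essai$Phylum). It only shows how the levels end up ordered. Note that tax_table stores a character matrix, so a factor written back into it may lose its levels, and whether plot_composition honours the order is not confirmed here.

```r
# Hypothetical phylum labels, standing in for essai$Phylum
phylum <- c("Firmicutes", "Other", "Bacteroidetes", "Proteobacteria")

# Keep every phylum in its original order, then append "Other" last
new_order <- c(setdiff(unique(phylum), "Other"), "Other")

# The factor keeps the data as-is but fixes the level order
phylum_f <- factor(phylum, levels = new_order)
levels(phylum_f)
```

Anything that orders by factor level (a ggplot2 legend, for instance) would then list Other last.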

I know this answer is very late, but maybe someone else can use it, because this is the only forum thread I could find on this question.
I am not an R expert, and this approach is only practical if you intend to plot a microbiome composition graph of the top 20-30 taxa. It is super rudimentary, but it worked for me.
The order in the graph is that of the OTU table, so I simply reordered its rows:
phyloseq_object@otu_table[c(1,2,3,5,4),]
Where:
- phyloseq_object is my phyloseq object,
- otu_table is the name of the slot (this is its default name),
- c(1,2,3,5,4) is my desired row order.
Greetings!!!
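The same row reordering can be done by name rather than by hard-coded indices. This is a minimal sketch on a plain matrix (abund and its taxa names are hypothetical, standing in for an OTU table with taxa as rows). It has not been checked against a real phyloseq object.

```r
# Hypothetical abundance matrix: taxa as rows, samples as columns
abund <- matrix(
  1:10, nrow = 5,
  dimnames = list(
    c("Actinobacteria", "Bacteroidetes", "Firmicutes", "Other", "Proteobacteria"),
    c("S1", "S2")
  )
)

# Row order with "Other" moved to the end, whatever its current position
ord <- c(setdiff(rownames(abund), "Other"), "Other")

# Index by name; drop = FALSE keeps the result a matrix
abund_reordered <- abund[ord, , drop = FALSE]
rownames(abund_reordered)
```

Indexing by name avoids having to look up which row number Other currently occupies.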

Related

Variable geom_text is overwritten when plots saved in list [duplicate]

This question already has an answer here:
List for Multiple Plots from Loop (ggplot2) - List elements being overwritten
I am trying to organize several dozens of plots using ggarrange, so I have setup a loop in which I save each plot in a list. Each plot differs from each other with different data, title, etc. Everything works perfectly until I try to use geom_text to place some text inside the plot. When the plots are saved in the list, each plot inherits the geom_text from the last plot in the list. I don't know how to avoid this.
library(ggplot2)
library(data.table)
library(ggpubr)
my.list=vector("list", length = 2)
dt=data.table(x=c(1,100,100000),y=c(1,100,100000))
plotname=c('first','second')
for (i in 1:length(my.list)) {
# each line must end with "+" so that R keeps reading the same expression
my.list[[i]]=ggplot(data = dt, aes(x = x, y = y )) + geom_point(size=1.5,aes(color=c('red'))) + labs(x=NULL, y=NULL) +
scale_color_manual(values='red') +
theme_bw() + theme(panel.background = element_rect(fill='light grey', colour='black'),legend.position = "none") +
geom_text(inherit.aes=FALSE,aes(x=500, y=100000, label=paste0('NRMSE:',i))) + ggtitle(paste0(plotname[i])) + coord_equal() +
geom_abline(slope=1) +
scale_y_log10(breaks = c(1,10,100,1000,10000,100000),limits=c(1,100000)) +
scale_x_log10(breaks = c(1,10,100,1000,10000,1000000),limits=c(1,100000)) +
labs(x=NULL, y=NULL) +
theme_bw() + theme(panel.background = element_rect(fill='light grey', colour='black'),legend.position = "none")
}
after this I do
plotosave=ggarrange(plotlist=my.list)
Using lapply instead of a for loop works fine:
my.list <- lapply(1:2, function(i) {
ggplot(data = dt, aes(x = x, y = y )) +
geom_point(size=1.5) +
labs(x=NULL, y=NULL) +
theme_bw() +
theme(panel.background = element_rect(fill='light grey', colour='black'),
legend.position = "none") +
geom_text(inherit.aes=FALSE,aes(x=50000, y=100000,
label=paste0('NRMSE:',i))) +
ggtitle(paste0(plotname[i]))
})
ggarrange(plotlist = my.list)
Note: the issue is not with ggarrange.
Roland:
The plot is built when you print the ggplot object. Anything that is not part of the data passed will be taken from the enclosing environment at exactly that point in time. If you use the iterator of a for loop in the plot, it has its last value by then (or any value you change it to later on). lapply avoids the issue for the reasons explained in the Note in its documentation.
Related post:
the problem is that ggplot() waits until you print the plot to resolve the variables in the aes() command.
I don't exactly know why this occurs but if you remove aes from geom_text it works.
library(ggplot2)
my.list = vector("list", length = 2)
dt = data.table::data.table(x=c(1,100,100000),y=c(1,100,100000))
plotname = c('first','second')
for (i in 1:length(my.list)) {
my.list[[i]]= ggplot(data = dt, aes(x = x, y = y )) +
geom_point(size=1.5) +
labs(x=NULL, y=NULL) +
theme_bw() +
theme(panel.background = element_rect(fill='light grey', colour='black'),
legend.position = "none") +
geom_text(x=50000, y=100000, label=paste0('NRMSE:',i)) +
ggtitle(paste0(plotname[i]))
}
plotosave = ggpubr::ggarrange(plotlist=my.list)
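The late evaluation described in this thread can be reproduced without ggplot2. The following base R sketch is only an analogy, not ggplot2's internals: it stores unevaluated expressions that mention i (much like aes() does) and evaluates them after the loop has finished. The names late and eager are made up for this illustration.

```r
# Store unevaluated expressions that refer to the loop variable i
late <- vector("list", 2)
for (i in 1:2) {
  late[[i]] <- quote(paste0("NRMSE:", i))
}

# Evaluated only now, after the loop: i already holds its last value
late_labels <- vapply(late, function(e) eval(e), character(1))

# bquote() substitutes the current value of i while the expression is created
eager <- lapply(1:2, function(i) bquote(paste0("NRMSE:", .(i))))
eager_labels <- vapply(eager, function(e) eval(e), character(1))

late_labels
eager_labels
```

Both entries of late_labels carry the last value of i, while eager_labels keeps one value per iteration. That is the same difference as between the for loop with aes(label = ...) and the versions that work.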

Circular tree with heatmap

This question may be trivial, but I cannot find a clean way to handle it.
I'm trying to plot a circular tree with a heatmap alongside it.
I'm using ggtree, but any ggplot2-based approach is welcome.
The problem is that I don't fully understand the gheatmap function.
I want:
1- the tip names AFTER the heatmap
2- two text columns after the heatmap (for now they may hold the same value, but I need to know how to add them)
3- the heatmap column names handled nicely. Should we remove the column names and use a different color scale for each column? Any solution would probably be better than the way it is now.
library(tidyverse)
library(ggtree)
library(treeio)
library(tidytree)
beast_file <- system.file("examples/MCC_FluA_H3.tree", package="ggtree")
beast_tree <- read.beast(beast_file)
genotype_file <- system.file("examples/Genotype.txt", package="ggtree")
genotype <- read.table(genotype_file, sep="\t", stringsAsFactors=FALSE)
colnames(genotype) <- sub("\\.$", "", colnames(genotype))
p <- ggtree(beast_tree, mrsd="2013-01-01",layout = "fan", open.angle = -270) +
geom_treescale(x=2008, y=1, offset=2) +
geom_tiplab(size=2)
gheatmap(p, genotype, offset=5, width=0.5, font.size=3,
colnames_angle=-45, hjust=0) +
scale_fill_manual(breaks=c("HuH3N2", "pdm", "trig"),
values=c("steelblue", "firebrick", "darkgreen"), name="genotype")
Thanks in advance
UPDATE:
I found a better way to plot the names of the heatmap columns.
I also found that simplifying the data helped to
clean up the tip labels a little.
Now I just need to add two text columns after the heatmap.
p <- ggtree(beast_tree)
gheatmap(
p, genotype, colnames=TRUE,
colnames_angle=90,
colnames_offset_y = 5,
colnames_position = "top"
) +
scale_fill_manual(breaks=c("HuH3N2", "pdm", "trig"),
values=c("steelblue", "firebrick", "darkgreen"), name="genotype")
UPDATE 2:
A rather crude improvement:
I just used ggplot to create the labels and merged the two plots with patchwork
library(patchwork)
p$data %>%
ggplot(aes(1, y= y, label = label )) +
geom_text(size=2) +
xlim(NA, 1) +
theme_classic() +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.title.y=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank()) -> adText
pp + adText
The answer, according to @xiangpin on GitHub.
Use a large offset value in geom_tiplab:
p <- ggtree(beast_tree)
p1 <- gheatmap(
p, genotype, colnames=TRUE,
colnames_angle=-45,
colnames_offset_y = 5,
colnames_position = "bottom",
width=0.3,
hjust=0, font.size=2) +
scale_fill_manual(breaks=c("HuH3N2", "pdm", "trig"),
values=c("steelblue", "firebrick", "darkgreen"), name="genotype") +
geom_tiplab(align = TRUE, linesize=0, offset = 7, size=2) +
xlim_tree(xlim=c(0, 36)) +
scale_y_continuous(limits = c(-1, NA))
p1
Using ggtreeExtra:
library(ggtreeExtra)
library(ggtree)
library(treeio)
library(ggplot2)
beast_file <- system.file("examples/MCC_FluA_H3.tree", package="ggtree")
genotype_file <- system.file("examples/Genotype.txt", package="ggtree")
tree <- read.beast(beast_file)
genotype <- read.table(genotype_file, sep="\t")
colnames(genotype) <- sub("\\.$", "", colnames(genotype))
genotype$ID <- row.names(genotype)
dat <- reshape2::melt(genotype, id.vars="ID", variable.name = "type", value.name="genotype", factorsAsStrings=FALSE)
dat$genotype <- unlist(lapply(as.vector(dat$genotype),function(x)ifelse(nchar(x)==0,NA,x)))
p <- ggtree(tree) + geom_treescale()
p2 <- p + geom_fruit(data=dat,
geom=geom_tile,
mapping=aes(y=ID, x=type, fill=genotype),
color="white") +
scale_fill_manual(values=c("steelblue", "firebrick", "darkgreen"),
na.translate=FALSE) +
geom_axis_text(angle=-45, hjust=0, size=1.5) +
geom_tiplab(align = TRUE, linesize=0, offset = 6, size=2) +
xlim_tree(xlim=c(0, 36)) +
scale_y_continuous(limits = c(-1, NA))
p2

Moving x or y axis together with tick labels to the middle of a single ggplot (no facets)

I made the following plot in Excel:
But then I thought I would make it prettier by using ggplot. I got this far:
If you're curious, the data is based on my answer here, although it doesn't really matter. The plot is a standard ggplot2 construct with some prettification, and the thick line for the x-axis through the middle is achieved with p + geom_hline(aes(yintercept=0)) (p is the ggplot object).
I feel that the axis configuration in the Excel plot is better. It emphasizes the 0 line (important when the data is money) and finding intercepts is much easier since you don't have to follow lines from all the way at the bottom. This is also how people draw axes when plotting on paper or boards.
Can the axis be moved like this in ggplot as well? I want not just the line, but the tick labels as well moved. If yes, how? If no, is the reason technical or by design? If by design, why was the decision made?
try this,
shift_axis <- function(p, y=0){
g <- ggplotGrob(p)
dummy <- data.frame(y=y)
ax <- g[["grobs"]][g$layout$name == "axis-b"][[1]]
p + annotation_custom(grid::grobTree(ax, vp = grid::viewport(y=1, height=sum(ax$height))),
ymax=y, ymin=y) +
geom_hline(aes(yintercept=y), data = dummy) +
theme(axis.text.x = element_blank(),
axis.ticks.x=element_blank())
}
p <- qplot(1:10, 1:10) + theme_bw()
shift_axis(p, 5)
I tried to change the theme's axis.text.x, but I could only change hjust.
So I think you can remove axis.text.x and then use geom_text() to add the labels back.
For example:
test <- data.frame(x=seq(1,5), y=seq(-1,3))
ggplot(data=test, aes(x,y)) +
geom_line() +
theme(axis.text.x=element_blank(), axis.ticks.x=element_blank()) +
geom_text(data=data.frame(x=seq(1,5), y=rep(0,5)), label=seq(1,5), vjust=1.5)
Maybe this code is useful.
Just to complete baptiste's excellent answer, here is the equivalent for moving the y axis:
shift_axis_x <- function(p, x=0){
g <- ggplotGrob(p)
dummy <- data.frame(x=x)
ax <- g[["grobs"]][g$layout$name == "axis-l"][[1]]
p + annotation_custom(grid::grobTree(ax, vp = grid::viewport(x=1, width = sum(ax$height))),
xmax=x, xmin=x) +
geom_vline(aes(xintercept=x), data = dummy) +
theme(axis.text.y = element_blank(),
axis.ticks.y=element_blank())
}
As alistaire commented it can be done using geom_hline and geom_text as shown below.
df <- data.frame(YearMonth = c(200606,200606,200608,200701,200703,200605),
person1 = c('Alice','Bob','Alice','Alice','Bob','Alice'),
person2 = c('Bob','Alice','Bob','Bob','Alice','Bob'),
Event = c('event1','event2','event3','event3','event2','event4')
)
df$YM <- as.Date(paste0("01",df$YearMonth), format="%d%Y%m")
rangeYM <- range(df$YM)
ggplot()+geom_blank(aes(x= rangeYM, y = c(-1,1))) + labs(x = "", y = "") +
theme(axis.ticks = element_blank()) +
geom_hline(yintercept = 0, col = 'maroon') +
scale_x_date(date_labels = '%b-%y', date_breaks = "month", minor_breaks = NULL) +
scale_y_continuous(minor_breaks = NULL) +
geom_text(aes(x = df$YM, y = 0, label = paste(format(df$YM, "%b-%y")), vjust = 1.5), colour = "#5B7FA3", size = 3.5, fontface = "bold")

Different font faces and sizes within label text entries in ggplot2

I am building charts that have two lines in the axis text. The first line contains the group name, the second line contains that group population. I build my axis labels as a single character string with the format "LINE1 \n LINE2". Is it possible to assign different font faces and sizes to LINE1 and LINE2, even though they are contained within a single character string? I would like LINE1 to be large and bolded, and LINE2 to be small and unbolded.
Here's some sample code:
Treatment <- rep(c('T','C'),each=2)
Gender <- rep(c('Male','Female'),2)
Response <- sample(1:100,4)
test_df <- data.frame(Treatment, Gender, Response)
xbreaks <- levels(test_df$Gender)
xlabels <- paste(xbreaks,'\n',c('POP1','POP2'))
hist <- ggplot(test_df, aes(x=Gender, y=Response, fill=Treatment, stat="identity"))
hist + geom_bar(position = "dodge") + scale_y_continuous(limits = c(0,
100), name = "") + scale_x_discrete(labels=xlabels, breaks = xbreaks) +
opts(
axis.text.x = theme_text(face='bold',size=12)
)
I tried this, but the result was one large, bolded entry, and one small, unbolded entry:
hist + geom_bar(position = "dodge") + scale_y_continuous(limits = c(0,
100), name = "") + scale_x_discrete(labels=xlabels, breaks = xbreaks) +
opts(
axis.text.x = theme_text(face=c('bold','plain'),size=c('15','10'))
)
Another possible solution is to create separate chart elements, but I don't think that ggplot2 has a 'sub-axis label' element available...
Any help would be very much appreciated.
Cheers,
Aaron
I also think that the graph cannot be made using ggplot2 features alone.
I would use grid.text and grid.gedit.
require(ggplot2)
Treatment <- rep(c('T','C'), each=2)
Gender <- rep(c('Male','Female'), 2)
Response <- sample(1:100, 4)
test_df <- data.frame(Treatment, Gender, Response)
xbreaks <- levels(test_df$Gender)
xlabels <- paste(xbreaks,'\n',c('',''))
hist <- ggplot(test_df, aes(x=Gender, y=Response, fill=Treatment,
stat="identity"))
hist + geom_bar(position = "dodge") +
scale_y_continuous(limits = c(0, 100), name = "") +
scale_x_discrete(labels=xlabels, breaks = xbreaks) +
opts(axis.text.x = theme_text(face='bold', size=12))
grid.text(label="POP1", x = 0.29, y = 0.06)
grid.text(label="POP2", x = 0.645, y = 0.06)
grid.gedit("GRID.text", gp=gpar(fontsize=8))
Please tune the code above to your environment (e.g. the position of the sub-axis labels and the font size).
I found another simple solution below:
require(ggplot2)
Treatment <- rep(c('T','C'),each=2)
Gender <- rep(c('Male','Female'),2)
Response <- sample(1:100,4)
test_df <- data.frame(Treatment, Gender, Response)
xbreaks <- levels(test_df$Gender)
xlabels[1] <- expression(atop(bold(Female), scriptstyle("POP1")))
xlabels[2] <- expression(atop(bold(Male), scriptstyle("POP2")))
hist <- ggplot(test_df, aes(x=Gender, y=Response, fill=Treatment,
stat="identity"))
hist +
geom_bar(position = "dodge") +
scale_y_continuous(limits = c(0, 100), name = "") +
scale_x_discrete(label = xlabels, breaks = xbreaks) +
opts(
axis.text.x = theme_text(size = 12)
)
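The plotmath labels used above can also be built programmatically instead of being typed one by one. This is a minimal base R sketch, assuming hypothetical vectors groups and pops that hold the two lines of each label. Only the label construction is shown, and the result is an expression vector of the same kind as the xlabels above.

```r
# Hypothetical inputs: first line (group name) and second line (population)
groups <- c("Female", "Male")
pops   <- c("POP1", "POP2")

# Build one plotmath call per label: bold group on top, small population below
label_calls <- mapply(
  function(g, p) bquote(atop(bold(.(g)), scriptstyle(.(p)))),
  groups, pops,
  SIMPLIFY = FALSE
)

# Convert the list of calls into an expression vector
xlabels <- as.expression(label_calls)
xlabels
```

The resulting xlabels could then be passed to the labels argument of scale_x_discrete in the same way as the hand-written expressions.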
All,
Using Triad's cheat, this is the closest I was able to get to a solution on this one. Let me know if you have any questions:
library(ggplot2)
spacing <- 0 #We can adjust how much blank space we have beneath the chart here
labels1= paste('Group',c('A','B','C','D'))
labels2 = rep(paste(rep('\n',spacing),collapse=''),length(labels1))
labels <- paste(labels1,labels2)
qplot(1:4,1:4, geom="blank") +
scale_x_continuous(breaks=1:length(labels), labels=labels) + xlab("")+
opts(plot.margin = unit(c(1, 1, 3, 0.5), "lines"),
axis.text.x = theme_text(face='bold', size=14))
xseq <- seq(0.15,0.9,length.out=length(labels)) #Assume for now that 0.15 and 0.9 are constant plot boundaries
sample_df <- data.frame(group=rep(labels1,each=2),subgroup=rep(c('a','b'),4),pop=sample(1:10,8))
popLabs <- by(sample_df,sample_df$group,function(subData){
paste(paste(subData$subgroup,' [n = ', subData$pop,']',sep=''),collapse='\n')
})
gridText <- paste("grid.text(label='\n",popLabs,"',x=",xseq,',y=0.1)',sep='')
sapply(gridText, function(x){ #Evaluate parsed character string for each element of gridText
eval(parse(text=x))
})
grid.gedit("GRID.text", gp=gpar(fontsize=12))
Cheers,
Aaron

Gap Y axis in ggplot

I have the ggplot below, with most Y values between 0 and 200 and one value around 3000:
I want to "zoom" in on most of the values, but still show the high value.
I wrote the following code:
Figure_2 <- ggplot(data = count_df, aes(x=ng,
y=Number)) +
geom_point(col = "darkmagenta") + ggtitle("start VS Number") +
xlab(expression(paste("start " , mu, "l"))) + ylab("Number") +
theme(plot.title = element_text(hjust = 0.5, color="orange", size=14,
face="bold.italic"),
axis.title.x = element_text(color="#993333", size=10, face = "bold"),
axis.title.y = element_text(color="#993333", size=10,face = "bold"))
Does anybody know how to achieve that?
A possible solution is to use facet_grid. I do not have the OP's exact data, but the approach is to split the y-axis into ranges. The OP mentions two ranges for Number: 0 - 200 and ~3000.
Hence, we can divide Number by 2000 and round up to turn it into a factor with two groups. That is, factor(ceiling(Number/2000)) creates a factor with two levels, one per range.
Let's take similar data as OP and try our approach:
# Data
count_df <- data.frame(ng = 1:30, Number = sample(200:220, 30, TRUE))
# Change one value high as 3000
count_df$Number[20] <- 3000
library(ggplot2)
ggplot(data = count_df, aes(x=ng, y=Number)) +
geom_point() +
facet_grid(factor(ceiling(Number/2000))~., scales = "free_y") +
ggtitle("start VS Number") +
xlab(expression(paste("start " , mu, "l")))
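To see what the faceting variable looks like on its own, here is a minimal base R sketch on a few hypothetical values (Number here is a made-up vector, not the OP's data).

```r
# Hypothetical values: three in the low range, one outlier
Number <- c(5, 150, 200, 3000)

# Values in (0, 2000] fall into group 1, values in (2000, 4000] into group 2
panel <- factor(ceiling(Number / 2000))
panel
```

One caveat of this grouping: a value of exactly 0 gives ceiling(0) = 0 and would land in a third group of its own, so the divisor or the rounding may need adjusting for data that contains zeros.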

Resources