How to hide (or remove) dots in the boxplot graph?

I have a question about how to hide (or remove) the dots in a boxplot graph.
This is the code I implemented.
install.packages("randomForestSRC")
install.packages("ggRandomForests")
library(randomForestSRC)
library(ggRandomForests)
data(pbc, package="randomForestSRC")
pbc.na <- na.omit(pbc)
set.seed(123)
rsf <- rfsrc(Surv(days,status)~., data=pbc.na, ntree=500, importance=T)
gg_v <- gg_variable(rsf, time = c(2000, 4000),
time.labels = c("2000 days", "4000 days"))
gg_v$stage <- as.factor(gg_v$stage)
plot(gg_v, xvar="stage", panel=T, points=F)+
ggplot2::theme_bw() +
ggplot2::geom_boxplot(outlier.shape=NA)+
ggplot2::labs(y="Survival (%)")+
ggplot2::coord_cartesian(ylim=c(-.01, 1.02))
I would like to hide (or remove) all of the event dots (both FALSE and TRUE).
However, I could not find any information on how to do this.
Please let me know how to do it.
Thanks always.

I am not familiar with how ggRandomForests works, but using the data frame gg_v, we can do the plotting directly in ggplot2.
library(ggplot2)
ggplot(gg_v, aes(stage, yhat, group = stage)) +
geom_boxplot(outlier.shape = NA) +
facet_wrap(~time, nrow = 2, strip.position = "right") +
ylab("Survival (%)") +
theme_bw()
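As a minimal, self-contained sketch of the same idea, the built-in mtcars data set can stand in for gg_v (an assumption for illustration only; the column names below are not from the original post). A plot built directly with ggplot2 contains only the layers you add, so with no point layer and outlier.shape = NA, no individual dots are drawn:

```r
library(ggplot2)

# mtcars stands in for gg_v here (illustrative assumption, not the poster's data)
p_box <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot(outlier.shape = NA) +  # suppress the outlier points
  labs(x = "Cylinders", y = "Miles per gallon") +
  theme_bw()

p_box
```

Note that outlier.shape = NA only hides the outliers; the axis range is still computed from all of the data.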

You can also use the geom_boxplot2 function from the Ipaper package on GitHub:
# devtools::install_github('kongdd/Ipaper')
library(Ipaper)
library(ggplot2)
ggplot(gg_v, aes(stage, yhat, group = stage)) +
geom_boxplot2(width = 0.8, width.errorbar = 0.5)+
facet_wrap(~time, nrow = 2, strip.position = "right") +
ylab("Survival (%)") +
theme_bw()

Related

How can I get the real scale from a facet_grid plot in R?

I am trying to add captions as it appears in this post.
For that reason, I need the real scale of the plot (x and y axis) when I am using facet_grid. I know that I can use layer_data, since it saves everything from the plot... However, it is not really accurate, because when I try to establish the limits using min and max from that output, the plot changes.
Here you have an example:
library(ggplot2)
library(dplyr)
val1 <- c(2.1490626,2.2035281,1.5927854,3.1399245,2.3967338,3.7915825,4.6691277,3.0727319,2.9230937,2.6239759,3.7664386,4.0160378,1.2500835,4.7648343,0.0000000,5.6740227,2.7510256,3.0709322,2.7998003,4.0809085,2.5178086,5.9713330,2.7779843,3.6724801,4.2648527,3.6841084,2.5597235,3.8477471,2.6587736,2.2742209,4.5862788,6.1989269,4.1167091,3.1769325,4.2404515,5.3627032,4.1576810,4.3387921,1.4024381,0.0000000,4.3999099,3.4381837,4.8269218,2.6308474,5.3481382,4.9549753,4.5389650,1.3002293,2.8648220,2.4015338,2.0962332,2.6774765,3.0581759,2.5786137,5.0539080,3.8545796,4.3429043,4.2233248,2.0434363,4.5980727)
val2 <- c(3.7691229,3.6478055,0.5435826,1.9665861,3.0802654,1.2248374,1.7311236,2.2492826,2.2365337,1.5726119,2.0147144,2.3550348,1.9527204,3.3689502,1.7847986,3.5901329,1.6833872,3.4240479,1.8372175,0.0000000,2.5701453,3.6551315,4.0327091,3.8781182)
df1 <- data.frame(value = val1)
df2 <- data.frame(value = val2)
data <- bind_rows(lst(df1, df2), .id = 'id')
data$Sex <- rep(c("Male", "Female"), times=84/2)
p <- data %>%
ggplot(aes(value)) +
geom_density(lwd = 1.2, colour="red", show.legend = FALSE) +
geom_histogram(aes(y=..density.., fill = id), bins=10, col="black", alpha=0.2) +
facet_grid(id ~ Sex ) +
xlab("type_data") +
ylab("Density") +
ggtitle("title") +
guides(fill=guide_legend(title="legend_title")) +
theme(strip.text.y = element_blank())
p
plot_info <- layer_data(p)
> min(plot_info$density)
[1] 7.166349e-09
> max(plot_info$density)
[1] 0.5738021
As you can see in the plot, the y-axis starts at 0 and ends at around 0.7. However, the maximum density is 0.57.
If I try to use the info from layer_data:
p + coord_cartesian(clip="off", ylim=c(min(plot_info$density), max(plot_info$density)),
xlim = c(min(plot_info$x), max(plot_info$x)))
The plot changes completely.
Does anyone know how I can get the scales that ggplot2 and facet_grid are using? I need the range of the density (y-axis) and of the x-axis.

Yes. To get the scales directly, use layer_scales(p), which gives you the range of the axes rather than just the range of the data, which is what you get from layer_data(p):
p + coord_cartesian(clip = "off",
ylim = layer_scales(p)$y$range$range,
xlim = layer_scales(p)$x$range$range)
Or, to combine this question with your last, where you add the text labels outside of the plotting panels, your result might be something like:
p + coord_cartesian(clip = "off",
ylim = layer_scales(p)$y$range$range,
xlim = layer_scales(p)$x$range$range) +
geom_text(data = data.frame(value = c(0, 6), id = c("df2", "df2"),
Sex = c('Female', 'Male')),
aes(y = -0.15, label = c('Female', 'Male')))
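The same pattern can be tried on a small stand-alone plot. The sketch below uses the built-in mtcars data (an assumption for illustration; p_hist, x_range and y_range are hypothetical names) and reads the trained scale ranges before passing them to coord_cartesian():

```r
library(ggplot2)

# A small faceted histogram on built-in data (illustrative stand-in)
p_hist <- ggplot(mtcars, aes(mpg)) +
  geom_histogram(bins = 10) +
  facet_grid(am ~ vs)

# layer_scales() returns the x and y scales of one panel (first row/column by default)
panel_scales <- layer_scales(p_hist)
x_range <- panel_scales$x$range$range
y_range <- panel_scales$y$range$range

# Reuse the ranges as fixed view limits
p_fixed <- p_hist + coord_cartesian(xlim = x_range, ylim = y_range, clip = "off")
p_fixed
```

Because the scale is trained on every aesthetic of the layer (including the bin edges and bar heights), these ranges cover the drawn bars rather than a single data column.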

Does this help? See ?layer_data:
summary(layer_data(p, i = 2))
Here i is the index of the layer you want to return. You can then take the min of xmin, the max of xmax, and so on.
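A likely reason the maximum of 0.57 falls short of the axis in the question is that layer_data(p) returns the first layer by default, which there is the density curve rather than the histogram. The following sketch, using mtcars as a stand-in (an assumption; p_demo and the other names are hypothetical), compares the two layers and derives limits from the histogram layer:

```r
library(ggplot2)

p_demo <- ggplot(mtcars, aes(mpg)) +
  geom_density() +                              # layer 1: density curve
  geom_histogram(aes(y = after_stat(density)),  # layer 2: histogram bars
                 bins = 10, alpha = 0.2)

density_data <- layer_data(p_demo, i = 1)
hist_data    <- layer_data(p_demo, i = 2)

# Peak of the smooth curve versus the tallest bar
max(density_data$density)
max(hist_data$ymax)

# Limits that cover the bars: bin edges on x, bar heights on y
x_limits <- c(min(hist_data$xmin), max(hist_data$xmax))
y_limits <- c(0, max(hist_data$ymax))
```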

How to add captions outside the plot on individual facets in ggplot2?

I am trying to add a caption to each facet (I am using facet_grid). I have seen this approach and this one, but neither gives me what I need. Also, the first approach returns a warning message for which I could not find a solution:
Warning message:
Vectorized input to `element_text()` is not officially supported.
Results may be unexpected or may change in future versions of ggplot2.
My example:
library(ggplot2)
library(datasets)
mydf <- CO2
a <- ggplot(data = mydf, aes(x = conc)) + geom_histogram(bins = 15, alpha = 0.75) +
labs(y = "Frequency") + facet_grid(Type ~ Treatment)
a
caption_df <- data.frame(
cyl = c(4,6),
txt = c("1st=4", "2nd=6")
)
a + coord_cartesian(clip="off", ylim=c(0, 3)) +
geom_text(
data=caption_df, y=1, x=100,
mapping=aes(label=txt), hjust=0,
fontface="italic", color="red"
) +
theme(plot.margin = margin(b=25))
The idea is to have one caption per plot, but with this approach the caption is repeated in every facet and overplotted.
Is it possible to have something like this (caption OUTSIDE the plot), but without the previous warning?
a + labs(caption = c("nonchilled=4", "chilled=6")) + theme(plot.caption = element_text(hjust=c(0, 1)))
NOTE: This is only an example, but I may need to put long captions (sentences) for each plot.
Example:
a + labs(caption = c("This is my first caption that maybe it will be large. Color red, n= 123", "This is my second caption that maybe it will be large. Color blue, n= 22")) +
theme(plot.caption = element_text(hjust=c(1, 0)))
Does anyone know how to do it?
Thanks in advance

You need to add the same faceting variables to your caption data frame as are present in your main data frame, to specify the facet in which each caption should be placed. If you want some facets unlabelled, simply use an empty string.
caption_df <- data.frame(
cyl = c(4, 6, 8, 10),
conc = c(0, 1000, 0, 1000),
Freq = -1,
txt = c("1st=4", "2nd=6", '', ''),
Type = rep(c('Quebec', 'Mississippi'), each = 2),
Treatment = rep(c('chilled', 'nonchilled'), 2)
)
a + coord_cartesian(clip="off", ylim=c(0, 3), xlim = c(0, 1000)) +
geom_text(data = caption_df, aes(y = Freq, label = txt)) +
theme(plot.margin = margin(b=25))
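A stripped-down variant of this idea labels only the bottom row of facets. It is a sketch under assumptions: the caption text, the y position of -1, and the names bottom_captions and p_cap are made up for illustration and will need tuning for a real figure.

```r
library(ggplot2)

a <- ggplot(CO2, aes(x = conc)) +
  geom_histogram(bins = 15, alpha = 0.75) +
  labs(y = "Frequency") +
  facet_grid(Type ~ Treatment)

# One row per caption; Type and Treatment decide which facet receives it
bottom_captions <- data.frame(
  Type      = "Mississippi",               # bottom row of the grid
  Treatment = c("nonchilled", "chilled"),
  conc      = 500,                         # x position (inherited aes(x = conc))
  y         = -1,                          # below the panel area
  txt       = c("First caption", "Second caption")
)

p_cap <- a +
  geom_text(data = bottom_captions, aes(y = y, label = txt),
            fontface = "italic", size = 3) +
  coord_cartesian(ylim = c(0, 3), clip = "off") +  # clip = "off" lets text leave the panel
  theme(plot.margin = margin(b = 25))

p_cap
```

Because each caption row names exactly one facet, the text is drawn once per target panel instead of being repeated in all of them.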

Circular tree with heatmap

This question is probably trivial, but I cannot find a clean solution.
I'm trying to plot a circular tree with a side heatmap.
I'm using ggtree, but any ggplot2-based approach is welcome.
The problem is that I don't fully understand the gheatmap function.
I want:
1- names AFTER the heatmap
2- two text columns after the heatmap (for now they may hold the same value, but I need to know how to add them)
3- heatmap column names handled nicely. Should we remove the column names and use a different color scale for each column? Whatever the solution is, it will probably be better than the current state.
library(tidyverse)
library(ggtree)
library(treeio)
library(tidytree)
beast_file <- system.file("examples/MCC_FluA_H3.tree", package="ggtree")
beast_tree <- read.beast(beast_file)
genotype_file <- system.file("examples/Genotype.txt", package="ggtree")
genotype <- read.table(genotype_file, sep="\t", stringsAsFactors=FALSE)
colnames(genotype) <- sub("\\.$", "", colnames(genotype))
p <- ggtree(beast_tree, mrsd="2013-01-01",layout = "fan", open.angle = -270) +
geom_treescale(x=2008, y=1, offset=2) +
geom_tiplab(size=2)
gheatmap(p, genotype, offset=5, width=0.5, font.size=3,
colnames_angle=-45, hjust=0) +
scale_fill_manual(breaks=c("HuH3N2", "pdm", "trig"),
values=c("steelblue", "firebrick", "darkgreen"), name="genotype")
Thanks in advance
UPDATE:
I found a better way to plot the names of the heatmap columns.
I also found that simplifying the data helped to
clean up the tip labels a little.
Now I just need to add two text columns after the heatmap.
p <- ggtree(beast_tree)
gheatmap(
p, genotype, colnames=TRUE,
colnames_angle=90,
colnames_offset_y = 5,
colnames_position = "top",
) +
scale_fill_manual(breaks=c("HuH3N2", "pdm", "trig"),
values=c("steelblue", "firebrick", "darkgreen"), name="genotype")
UPDATE 2:
A very bad improvement
I just used ggplot to create the label and merge with patchwork
library(patchwork)
p$data %>%
ggplot(aes(1, y= y, label = label )) +
geom_text(size=2) +
xlim(NA, 1) +
theme_classic() +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.title.y=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank()) -> adText
pp + adText

The answer, according to #xiangpin on GitHub.
Use a large offset value in geom_tiplab:
p <- ggtree(beast_tree)
p1 <- gheatmap(
p, genotype, colnames=TRUE,
colnames_angle=-45,
colnames_offset_y = 5,
colnames_position = "bottom",
width=0.3,
hjust=0, font.size=2) +
scale_fill_manual(breaks=c("HuH3N2", "pdm", "trig"),
values=c("steelblue", "firebrick", "darkgreen"), name="genotype") +
geom_tiplab(align = TRUE, linesize=0, offset = 7, size=2) +
xlim_tree(xlim=c(0, 36)) +
scale_y_continuous(limits = c(-1, NA))
p1
Using ggtreeExtra:
library(ggtreeExtra)
library(ggtree)
library(treeio)
library(ggplot2)
beast_file <- system.file("examples/MCC_FluA_H3.tree", package="ggtree")
genotype_file <- system.file("examples/Genotype.txt", package="ggtree")
tree <- read.beast(beast_file)
genotype <- read.table(genotype_file, sep="\t")
colnames(genotype) <- sub("\\.$", "", colnames(genotype))
genotype$ID <- row.names(genotype)
dat <- reshape2::melt(genotype, id.vars="ID", variable.name = "type", value.name="genotype", factorsAsStrings=FALSE)
dat$genotype <- unlist(lapply(as.vector(dat$genotype),function(x)ifelse(nchar(x)==0,NA,x)))
p <- ggtree(tree) + geom_treescale()
p2 <- p + geom_fruit(data=dat,
geom=geom_tile,
mapping=aes(y=ID, x=type, fill=genotype),
color="white") +
scale_fill_manual(values=c("steelblue", "firebrick", "darkgreen"),
na.translate=FALSE) +
geom_axis_text(angle=-45, hjust=0, size=1.5) +
geom_tiplab(align = TRUE, linesize=0, offset = 6, size=2) +
xlim_tree(xlim=c(0, 36)) +
scale_y_continuous(limits = c(-1, NA))
p2

Set the Axis values (in an animation)

How do I stop the y-axis from changing during an animation?
The graph I made is at http://i.imgur.com/EKx6Tw8.gif
The idea is to make an animated heatmap of population and income each year. The problem is that the y-axis sometimes jumps to include 0, or fails to include the highest value. How do you fix the axis values firmly? I know this must be a common issue, but I can't find the answer.
The code to recreate it is:
library(gapminder)
library(ggplot2)
library(devtools)
install_github("dgrtwo/gganimate")
library(gganimate)
library(animation)  # provides saveGIF()
library(dplyr)
mydata <- dplyr::select(gapminder, country,continent,year,lifeExp,pop,gdpPercap)
# bin life expectancy into 5-year bins
mydata$lifeExp2 <- as.integer(round((mydata$lifeExp-2)/5)*5)
mydata$income <- cut(mydata$gdpPercap, breaks=c(0,250,500,750,1000,1500,2000,2500,3000,3500,4500,5500,6500,7500,9000,11000,21000,31000,41000, 191000),
labels=c(0,250,500,750,1000,1500,2000,2500,3000,3500,4500,5500,6500,7500,9000,11000,21000,31000,41000))
sizePer <- mydata%>%
group_by(lifeExp2, income, year)%>%
mutate(popLikeThis = sum(pop))%>%
group_by(year)%>%
mutate(totalPop = sum(as.numeric(pop)))%>%
mutate(per = (popLikeThis/totalPop)*100)
sizePer$percent <- cut(sizePer$per, breaks=c(0,.1,.3,1,2,3,5,10,20,Inf),
labels=c(0,.1,.3,1,2.0,3,5,10,20))
saveGIF({
for(i in c(1997,2002,2007)){
print(ggplot(sizePer %>% filter(year == i),
aes(x = lifeExp2, y = income)) +
geom_tile(aes(fill = percent)) +
theme_bw()+
theme(legend.position="top", plot.title = element_text(size=30, face="bold",hjust = 0.5))+
coord_cartesian(xlim = c(20,85), ylim = c(0,21)) +
scale_fill_manual("%",values = c("#ffffcc","#ffeda0","#fed976","#feb24c","#fd8d3c","#fc4e2a","#e31a1c","#bd0026","#800026"),drop=FALSE)+
annotate(x=80, y=3, geom="text", label=i, size = 6) +
annotate(x=80, y=1, geom="text", label="#iamreddave", size = 5) +
ylab("Income") +
xlab("Life Expectancy")+
ggtitle("Worldwide Life Expectancy and Income")
)}
}, interval=0.7,ani.width = 900, ani.height = 600)

Solution:
Add scale_y_discrete(drop = FALSE) to the ggplot call, so that unused factor levels stay on the axis in every frame. Answered by #bdemarest in the comments.
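The effect can be checked on a toy data frame. In this sketch (the data and the names frames, one_frame, p_drop and p_keep are hypothetical), one "frame" uses only a single level of a three-level factor; with the default the axis keeps only that level, while drop = FALSE keeps all three:

```r
library(ggplot2)

# Toy data: a factor with three levels, as in the binned income variable
frames <- data.frame(
  x      = c(1, 2, 3),
  income = factor(c("low", "mid", "high"), levels = c("low", "mid", "high"))
)

# A single frame that happens to contain only one level
one_frame <- frames[frames$income == "low", ]

p_drop <- ggplot(one_frame, aes(x, income)) + geom_tile()
p_keep <- p_drop + scale_y_discrete(drop = FALSE)

# Compare the y-axis limits of the two plots
limits_drop <- layer_scales(p_drop)$y$get_limits()
limits_keep <- layer_scales(p_keep)$y$get_limits()
limits_drop
limits_keep
```

Since every frame of the animation is a separate plot, keeping the full set of levels in each one is what stops the axis from jumping.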

Different font faces and sizes within label text entries in ggplot2

I am building charts that have two lines in the axis text. The first line contains the group name, the second line contains that group population. I build my axis labels as a single character string with the format "LINE1 \n LINE2". Is it possible to assign different font faces and sizes to LINE1 and LINE2, even though they are contained within a single character string? I would like LINE1 to be large and bolded, and LINE2 to be small and unbolded.
Here's some sample code:
Treatment <- rep(c('T','C'),each=2)
Gender <- rep(c('Male','Female'),2)
Response <- sample(1:100,4)
test_df <- data.frame(Treatment, Gender, Response)
xbreaks <- levels(test_df$Gender)
xlabels <- paste(xbreaks,'\n',c('POP1','POP2'))
hist <- ggplot(test_df, aes(x=Gender, y=Response, fill=Treatment, stat="identity"))
hist + geom_bar(position = "dodge") + scale_y_continuous(limits = c(0,
100), name = "") + scale_x_discrete(labels=xlabels, breaks = xbreaks) +
opts(
axis.text.x = theme_text(face='bold',size=12)
)
I tried this, but the result was one large, bolded entry, and one small, unbolded entry:
hist + geom_bar(position = "dodge") + scale_y_continuous(limits = c(0,
100), name = "") + scale_x_discrete(labels=xlabels, breaks = xbreaks) +
opts(
axis.text.x = theme_text(face=c('bold','plain'),size=c('15','10'))
)
Another possible solution is to create separate chart elements, but I don't think that ggplot2 has a 'sub-axis label' element available...
Any help would be very much appreciated.
Cheers,
Aaron

I also don't think the graph can be made using only ggplot2 features.
I would use grid.text and grid.gedit.
library(ggplot2)
library(grid)  # grid.text(), grid.gedit(), gpar()
Treatment <- rep(c('T','C'), each=2)
Gender <- rep(c('Male','Female'), 2)
Response <- sample(1:100, 4)
test_df <- data.frame(Treatment, Gender, Response)
# Discrete axis values are ordered alphabetically: "Female", "Male"
xbreaks <- sort(unique(test_df$Gender))
# The blank second line leaves room for the sub-axis labels drawn below
xlabels <- paste(xbreaks,'\n',c('',''))
hist <- ggplot(test_df, aes(x=Gender, y=Response, fill=Treatment))
hist + geom_col(position = "dodge") +
scale_y_continuous(limits = c(0, 100), name = "") +
scale_x_discrete(labels=xlabels, breaks = xbreaks) +
theme(axis.text.x = element_text(face='bold', size=12))
# Draw the sub-axis labels on top of the printed plot
grid.text(label="POP1", x = 0.29, y = 0.06)
grid.text(label="POP2", x = 0.645, y = 0.06)
# Set the font size of the text grobs whose names match "GRID.text"
grid.gedit("GRID.text", gp=gpar(fontsize=8))
Please tune the code above to your environment (e.g. the position of the sub-axis labels and the font size).

I found another simple solution below:
library(ggplot2)
Treatment <- rep(c('T','C'),each=2)
Gender <- rep(c('Male','Female'),2)
Response <- sample(1:100,4)
test_df <- data.frame(Treatment, Gender, Response)
# Discrete axis values are ordered alphabetically: "Female", "Male"
xbreaks <- sort(unique(test_df$Gender))
# plotmath: atop() stacks two lines; bold() and scriptstyle() style each line
xlabels <- expression(atop(bold(Female), scriptstyle("POP1")),
atop(bold(Male), scriptstyle("POP2")))
hist <- ggplot(test_df, aes(x=Gender, y=Response, fill=Treatment))
hist +
geom_col(position = "dodge") +
scale_y_continuous(limits = c(0, 100), name = "") +
scale_x_discrete(labels = xlabels, breaks = xbreaks) +
theme(
axis.text.x = element_text(size = 12)
)

All,
Using Triad's cheat, this is the closest I was able to get to a solution on this one. Let me know if you have any questions:
library(ggplot2)
library(grid)  # grid.text(), grid.gedit(), gpar()
spacing <- 0 #We can adjust how much blank space we have beneath the chart here
labels1= paste('Group',c('A','B','C','D'))
labels2 = rep(paste(rep('\n',spacing),collapse=''),length(labels1))
labels <- paste(labels1,labels2)
ggplot(data.frame(x = 1:4, y = 1:4), aes(x, y)) + geom_blank() +
scale_x_continuous(breaks=1:length(labels), labels=labels) + xlab("")+
theme(plot.margin = unit(c(1, 1, 3, 0.5), "lines"),
axis.text.x = element_text(face='bold', size=14))
xseq <- seq(0.15,0.9,length.out=length(labels)) #Assume for now that 0.15 and 0.9 are constant plot boundaries
sample_df <- data.frame(group=rep(labels1,each=2),subgroup=rep(c('a','b'),4),pop=sample(1:10,8))
popLabs <- by(sample_df,sample_df$group,function(subData){
paste(paste(subData$subgroup,' [n = ', subData$pop,']',sep=''),collapse='\n')
})
gridText <- paste("grid.text(label='\n",popLabs,"',x=",xseq,',y=0.1)',sep='')
sapply(gridText, function(x){ #Evaluate parsed character string for each element of gridText
eval(parse(text=x))
})
grid.gedit("GRID.text", gp=gpar(fontsize=12))
Cheers,
Aaron
