How to overlay two geom_bar? - r

I'm trying to overlay the bars of two geom_bar layers derived from 2 separate data.frames.
dEQ
lab perc
1 lmP 55.9
2 lmN 21.8
3 Nt 0.6
4 expG 5.6
5 expD 0.0
6 prbN 11.2
7 prbP 5.0
and
LMD
lab perc
1 lmP 16.8
2 lmN 8.9
3 Nt 0.0
4 expG 0.0
5 expD 0.0
6 prbN 0.0
7 prbP 0.0
The first plot is:
p <- ggplot(dEQ, aes(lab, perc)) +
xlab(xlabel) + ylab(ylabel) +
geom_bar(stat="identity", colour="blue", fill="darkblue") +
geom_text(aes(vecX, vecYEQ+1.5, label=vecYlbEQ), data=dEQ, size=8.5) +
theme_bw() +
opts(axis.text.x = theme_text(size = 20, face = "bold", colour = "black")) +
opts(axis.text.y = theme_text(size = 20, face = "bold", colour = "black")) +
coord_flip() +
scale_y_continuous(breaks=c(0,10,20,30,40,50,60),
labels=c("0","","20","","40","","60"),
limits = c(0, 64), expand = c(0,0))
print(p)
but I want to overplot with another geom_bar from data.frame LMD
ggplot(LMD, aes(lab, perc)) +
geom_bar(stat="identity", colour="blue", fill="red", add=T)
and I want to have a legend.

Here is an example:
p <- ggplot(NULL, aes(lab, perc)) +
geom_bar(aes(fill = "dEQ"), data = dEQ, stat = "identity", alpha = 0.5) +
geom_bar(aes(fill = "LMD"), data = LMD, stat = "identity", alpha = 0.5)
p
But I recommend combining them with rbind and plotting them dodged:
dEQ$name <- "dEQ"
LMD$name <- "LMD"
d <- rbind(dEQ, LMD)
p <- ggplot(d, aes(lab, perc, fill = name)) + geom_bar(stat = "identity", position = "dodge")
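For anyone who wants to run the two approaches, here is a minimal sketch that rebuilds both data frames from the tables printed in the question. It assumes a ggplot2 version that provides geom_col() (a shorthand for geom_bar(stat = "identity")); the names labs, p_overlay and p_dodge are my own.

```r
library(ggplot2)

# Data rebuilt from the tables printed in the question
labs <- c("lmP", "lmN", "Nt", "expG", "expD", "prbN", "prbP")
dEQ <- data.frame(lab = labs, perc = c(55.9, 21.8, 0.6, 5.6, 0.0, 11.2, 5.0))
LMD <- data.frame(lab = labs, perc = c(16.8, 8.9, 0.0, 0.0, 0.0, 0.0, 0.0))

# Tag each row with its source so that fill can be mapped and a legend is drawn
dEQ$name <- "dEQ"
LMD$name <- "LMD"
d <- rbind(dEQ, LMD)

# Overlaid bars: position = "identity" starts both sets of bars at zero
p_overlay <- ggplot(d, aes(lab, perc, fill = name)) +
  geom_col(position = "identity", alpha = 0.5)

# Side-by-side bars
p_dodge <- ggplot(d, aes(lab, perc, fill = name)) +
  geom_col(position = "dodge")
```

Print p_overlay or p_dodge to draw the plot; because fill is mapped to name, the legend comes for free in both cases.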

This does not directly answer the OP's requirement, but because many later questions on SO have been closed as duplicates of this one, I am proposing a method for drawing bars within a bar in ggplot2.
Example with two dodged bars (a group-wise division) inside one bigger bar:
library(tidyverse)
set.seed(40)
df <- tibble(name = LETTERS[1:10], provision = rnorm(mean = 100, sd = 20, n = 10),
expenditure = provision - rnorm(mean = 25, sd = 10, n = 10))
df %>% mutate(savings = provision - expenditure) %>%
pivot_longer(cols = c("expenditure", "savings"), names_to = "Exp", values_to = "val") %>%
ggplot() + geom_bar(aes(x= name, y = provision/2), stat = "identity", fill = "blue", width = 0.9, alpha = 0.3) +
geom_col(aes(x=name,y=val, fill = Exp), position ="dodge", width = 0.7) +
scale_y_continuous(name = "Amount in \u20b9")

Another option is to overlay the bars without lowering their opacity via alpha. Combine the data, group_by your fill variable, arrange(desc()) your y variable, and use position = position_identity() so that the bars are drawn on top of each other, with the highest values behind and the lower values in front. Then you don't need to change the transparency. Here is a reproducible example:
# Add name for fill aesthetic
dEQ$name <- "dEQ"
LMD$name <- "LMD"
library(dplyr)
library(ggplot2)
dEQ %>%
rbind(LMD) %>%
group_by(name) %>%
arrange(desc(perc)) %>%
ggplot(aes(x = lab, y = perc, fill = name)) +
geom_bar(stat="identity", position = position_identity())
Created on 2022-11-02 with reprex v2.0.2
As you can see, the bars overlay while keeping their original opacity.

Related

How to customize Horizontal dots plot?

I want to plot customized Horizontal dots using my data and the code given here
data:
df <- data.frame (origin = c("A","B","C","D","E","F","G","H","I","J"),
Percentage = c(23,16,32,71,3,60,15,21,44,60),
rate = c(10,12,20,200,-25,12,13,90,-105,23),
change = c(10,12,-5,12,6,8,0.5,-2,5,-2))
origin Percentage rate change
1 A 23 10 10.0
2 B 16 12 12.0
3 C 32 20 -5.0
4 D 71 200 12.0
5 E 3 -25 6.0
6 F 60 12 8.0
7 G 15 13 0.5
8 H 21 90 -2.0
9 I 44 -105 5.0
10 J 60 23 -2.0
The observations from the 'origin' column need to go on the y-axis. The corresponding values in the 'change' and 'rate' columns must be shown in boxes instead of circles and distinguished by colour, for example values from 'change' in lightblue and values from 'rate' in blue. In addition, I want to add a second vertical axis on the right and put circles on it whose size is defined by the corresponding value in the 'Percentage' column.
Output of code from the link:
Expected outcome (something like this):
Try this.
First, reshaping so that both rate and change are in one column better supports ggplot's general preference towards "long" data.
df2 <- reshape2::melt(df, id.vars = c("origin", "Percentage"))
(That can also be done using tidyr::pivot_longer.)
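A minimal sketch of the same wide-to-long reshape with tidyr, assuming tidyr is installed. The names_to and values_to arguments are set so that the result has the same column names that melt produces by default ("variable" and "value"); the row order differs from melt, which does not matter for the plot.

```r
library(tidyr)

# Data from the question
df <- data.frame(origin = c("A","B","C","D","E","F","G","H","I","J"),
                 Percentage = c(23,16,32,71,3,60,15,21,44,60),
                 rate = c(10,12,20,200,-25,12,13,90,-105,23),
                 change = c(10,12,-5,12,6,8,0.5,-2,5,-2))

# Stack rate and change into one value column, keeping origin and Percentage
df2 <- pivot_longer(df, cols = c(rate, change),
                    names_to = "variable", values_to = "value")
```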
The plot:
ggplot(df2, aes(value, origin)) +
geom_label(aes(label = value, fill = variable, color = variable)) +
geom_point(aes(size = Percentage), x = max(df2$value) + 20, shape = 21) +
scale_x_continuous(expand = expansion(add = c(15, 25))) +
scale_fill_manual(values = c(change="lightblue", rate="blue")) +
scale_color_manual(values = c(change="black", rate="white")) +
theme_bw() +
theme(panel.border = element_blank(), panel.grid.major.x = element_blank(), panel.grid.minor.x = element_blank()) +
labs(x = NULL, y = NULL)
The legend and labels can be adjusted using the usual ggplot methods. Overlapping labels are an issue you will need to contend with.
Update on OP request: See comments:
gg_dot +
geom_text(aes(x = rate, y = origin,
label = paste0(round(rate, 1), "%")),
col = "black") +
geom_text(aes(x = change, y = origin,
label = paste0(round(change, 1), "%")),
col = "white") +
geom_text(aes(x = x, y = y, label = label, col = label),
data.frame(x = c(40 - 1.1, 180 + 0.6), y = 11,
label = c("change", "rate")), size = 6) +
scale_color_manual(values = c("#9DBEBB", "#468189"), guide = "none") +
scale_y_discrete(expand = c(0.2, 0))
First answer:
Something like this?
library(tidyverse)
library(dslabs)
gg_dot <- df %>%
arrange(rate) %>%
mutate(origin = fct_inorder(origin)) %>%
ggplot() +
# remove axes and superfluous grids
theme_classic() +
theme(axis.title = element_blank(),
axis.ticks.y = element_blank(),
axis.line = element_blank()) +
# add a dummy point for scaling purposes
geom_point(aes(x = 12, y = origin),
size = 0, col = "white") +
# add a horizontal guide line for each origin
geom_hline(yintercept = 1:10, col = "grey80") +
# add a point for each rate value
geom_point(aes(x = rate, y = origin),
size = 11, col = "#9DBEBB") +
# add a point for each change value
geom_point(aes(x = change, y = origin),
size = 11, col = "#468189")
gg_dot +
geom_text(aes(x = rate, y = origin,
label = paste0(round(rate, 1))),
col = "black") +
geom_text(aes(x = change, y = origin,
label = paste0(round(change, 1))),
col = "white") +
geom_text(aes(x = x, y = y, label = label, col = label),
data.frame(x = c(40 - 1.1, 180 + 0.6), y = 11,
label = c("change", "rate")), size = 6) +
scale_color_manual(values = c("#9DBEBB", "#468189"), guide = "none") +
scale_y_discrete(expand = c(0.2, 0))

Set size line plot with different y axis as addition to a stacked barplot

I would like to plot a stacked barplot with an added line plot that shows the overall set sizes. Plotting the stacked barplot in ggplot2 is no problem; the difficulty is the additional line with a different y axis. I'm using a long-formatted table as input, so there is no 'overall size' column.
Code to reproduce sample table:
df <- data.frame(Sample=c("S1","S2","S3","S4","S5","S6"), A=c(30,52,50,81,23,48), B=c(12,20,15,22,30,14), C=c(rep(15,6)))
df.melt <- melt(setDT(df), id.vars = "Sample", variable.name = "Group")
Head of the table:
Sample Group value
1: S1 A 30
2: S2 A 52
3: S3 A 50
4: S4 A 81
5: S5 A 23
6: S6 A 48
Code to draw stacked barplot:
ggplot(df.melt, aes(x = Sample, y = value, fill = Group)) +
geom_col(position = position_fill(reverse = TRUE)) +
theme(axis.text.x=element_text(angle=45, hjust=1), legend.title=element_blank()) +
scale_fill_brewer(palette="Set3") +
ylab("% of Total") +
scale_y_continuous(labels = percent) +
scale_x_discrete(limits = unique(df.melt$Sample))
The line should run across the six stacked bars and mark the size of each set, e.g. 57 (A + B + C) for sample S1, and the y axis labels to the right of the plot should show the set size range.
You can put the data set directly in the geom. This allows you to use different data sets for each geom. Secondary axes are a bit tricky: they need to be a function of the primary axis, and the data must be adjusted accordingly. I've used 120 as the adjustment factor.
percent <- c("0%", "25%", "50%", "75%", "100%")
set_sizes <- df %>%
rowwise %>%
mutate(Size = sum(A, B, C))
ggplot() +
geom_col(df.melt, mapping = aes(x = Sample, y = value, fill = Group),position = position_fill(reverse = TRUE)) +
geom_line(set_sizes, mapping = aes(x = Sample, y = Size / 120, group = 1)) +
scale_y_continuous(name = "% of Total", labels = percent, sec.axis = sec_axis(~ .*120, name = "Sample Size")) +
theme(axis.text.x=element_text(angle=45, hjust=1), legend.title=element_blank()) +
scale_fill_brewer(palette="Set3") +
scale_x_discrete(limits = unique(df.melt$Sample))
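The factor 120 is a hand-picked constant slightly above the largest set size. As a sketch, it could be derived from the data instead. The name scale_factor and the choice to round up to the next multiple of 10 are my own assumptions, not part of the answer above.

```r
# Data from the question (wide format)
df <- data.frame(Sample = c("S1","S2","S3","S4","S5","S6"),
                 A = c(30,52,50,81,23,48),
                 B = c(12,20,15,22,30,14),
                 C = rep(15, 6))

# Total set size per sample (A + B + C)
set_sizes <- data.frame(Sample = df$Sample,
                        Size = unname(rowSums(df[, c("A", "B", "C")])))

# Largest size rounded up to the next multiple of 10
scale_factor <- ceiling(max(set_sizes$Size) / 10) * 10

# Values to draw on the primary (0 to 1) axis; the secondary axis reverses
# the division with sec_axis(~ . * scale_factor)
set_sizes$scaled <- set_sizes$Size / scale_factor
```

With this data the largest set size is 118, so scale_factor comes out as 120 and every scaled value stays inside the 0 to 1 range of the filled bars.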
Alternatively, you can use cowplot to arrange two independent plots on top of each other, e.g.:
suppressMessages(invisible(lapply(c("data.table", "ggplot2", "cowplot"),
require, character.only=TRUE)))
df <- data.table(Sample=c("S1","S2","S3","S4","S5","S6"),
A=c(30,52,50,81,23,48), B=c(12,20,15,22,30,14), C=c(rep(15,6)))
df.melt <- melt(df, id.vars = "Sample", variable.name = "Group")
percent <- paste0(sprintf("%s", seq(0, 100, 25)), "%")
p1 <- ggplot(df.melt, aes(x = Sample, y = value, fill = Group)) +
geom_col(position = position_fill(reverse = TRUE)) +
theme(axis.text.x=element_text(angle=45, hjust=1), legend.title=element_blank()) +
scale_fill_brewer(palette="Set3") +
ylab("% of Total") +
scale_y_continuous(labels = percent) +
scale_x_discrete(limits = unique(df.melt$Sample))
p2 <- ggplot(df.melt[, .(value=sum(value)), by="Sample"],
aes(x = Sample, y = value, group=1)) +
geom_line() +
scale_x_discrete(labels = NULL, breaks = NULL) +
labs(x = NULL)
plot_grid(p2, NULL, p1, align="hv", nrow=3, axis='tlbr', rel_heights=c(1, -.28, 4), greedy=FALSE)
Created on 2022-02-20 by the reprex package (v2.0.1)

Log10 Y-Axis starting from 0

I created a bar plot to show differences in water accumulation on different sites and layers. Because one value is much higher than the others, I want to put the y-axis on a log10 scale. It all works, but the result looks rather unintuitive. Is it possible to set the lower limit of the y-axis to 0 so that the bar with the value 0.2 does not point downwards?
Here is the code I used:
p2 <- ggplot(data_summary2, aes(x= Site, y= small_mean, fill= Depth, Color= Depth))+
geom_bar(stat = "identity", position = "dodge", alpha=1)+
geom_errorbar(aes(ymin= small_mean - sd, ymax= small_mean + sd),
position = position_dodge(0.9),width=0.25, alpha= 0.6)+
scale_fill_brewer(palette = "Greens")+
geom_text(aes(label=small_mean),position=position_dodge(width=0.9), vjust=-0.25, hjust= -0.1, size= 3)+
#geom_text(aes(label= Tukey), position= position_dodge(0.9), size=3, vjust=-0.8, hjust= -0.5, color= "gray25")+
theme_bw()+
theme(legend.position = c(0.2, 0.9),legend.direction = "horizontal")+
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
labs(fill="Depth [cm]")+
theme(axis.text.x = element_text(angle = 25, size =9, vjust = 1, hjust=1))+
scale_x_discrete(labels= c( "Site 1\n(Hibiscus tillaceus)","Site 2 \n(Ceiba pentandra)","Site 3 \n(Clitoria fairchildiana)","Site 4 \n(Pachira aquatica)"))+
#theme(legend.position = c(0.85, 0.7))+
labs(x= "Sites\n(Type of Tree)", y= "µg Deep-Water/ g rhizosphere soil", title = "Average microbial Deep-water incorporation per Site", subtitle = "Changes over Time and Depth")+
facet_grid(.~Time, scale = "free")
p2 + scale_y_continuous(trans = "log10")
This is what the plot looks like:
On a log scale there is no 0, so the only sensible place for bars to start from is y = 10^0, i.e. 1.
However, you can create a pseudolog scale using scales::pseudo_log_trans to get 0 included on the axis so that all the bars go in the same direction. I'm borrowing from this answer. NOTE: it's important to add 0 to the breaks to make it clear that this is a pseudolog scale. Compare the two plots below:
library(tidyverse)
library(scales)
# make up data with positive values above and below 1
d <- tibble(grp = LETTERS[1:5],
val = 10^(-2:2))
# normal plot with true log scale doesn't contain 0
d %>%
ggplot(aes(x = grp, y = val, fill = grp)) +
geom_col() +
ggtitle("On True Log Scale Bars Start at y = 1") +
scale_y_log10() # or if you prefer: scale_y_continuous(trans = "log10")
# set range of 'linear' portion of pseudolog scale
sigma <- min(d$val)
# plot on pseudolog to get all bars to extend to 0
d %>%
ggplot(aes(x = grp, y = val, fill = grp)) +
geom_col() +
ggtitle("On Pseudolog Scale Bars Start at y = 0") +
scale_y_continuous(
trans = pseudo_log_trans(base = 10, sigma = sigma),
breaks = c(0, 10^(-2:2)),
labels = label_number(accuracy = 0.01)
)
Created on 2021-12-30 by the reprex package (v2.0.1)
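To see numerically why the pseudolog scale can include 0 while the true log scale cannot, here is a small sketch that applies the transformation directly. It assumes the scales package is installed; sigma = 0.01 matches min(d$val) in the example above, and the names tr, x and y are my own.

```r
library(scales)

# Same settings as in the plot above: base 10, sigma = smallest data value
sigma <- 0.01
tr <- pseudo_log_trans(sigma = sigma, base = 10)

x <- c(0, 0.01, 0.1, 1, 10, 100)

# Positions of x on the pseudolog axis; 0 maps to 0
y <- tr$transform(x)

# On a true log axis 0 has no finite position
log10(0)   # -Inf

# The inverse maps axis positions back to the original values
x_back <- tr$inverse(y)
```

Because the transformed values are finite and increasing from 0 upwards, every bar has a well-defined baseline at 0 and extends in the same direction.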

Control jitter width to allow overlapping of categories ggplot2

I have a dataset looking like this:
Flowers Sun_Exposition Value Repl
1: Tulipe mid 87.9 Aa1
2: Tulipe mid 92.8 Aa2
3: Tulipe mid 86.4 Aa3
4: Tulipe mid 83.3 Aa4
5: Tulipe mid 91.3 Aa5-1
6: Tulipe mid 91.4 Aa5-2
Flowers has two categories and there are 4 different sun expositions. For each combination I have a different number of replicates.
I would like to draw a barplot (with sd) that also shows the points, shaped by replicate.
Here is my code:
# summarize data
dataSum <- data[, .(M = mean(Value, na.rm = T), S = sd(Value, na.rm = T)),
by = .(Flowers, Sun_Exposition)]
# the plot
p <- ggplot(dataSum, aes(x = Sun_Exposition, y = M, fill = Flowers)) +
geom_bar(stat = "identity", color = "black",
position = position_dodge(.9)) +
geom_errorbar(aes(ymin = M, ymax = M + S), width = .2,
position = position_dodge(.9)) +
geom_jitter(data = data,
mapping = aes(x = Sun_Exposition, y = Value,
fill = Flowers,
shape = Repl, color = Flowers),
size = 2,
position = position_dodge(width = 0.9)) +
scale_shape_manual(values = c(19,17,18,15,7,8,6,5,4,2,13,12,3))+
scale_fill_manual(values = c("gray", "lavender")) +
scale_color_manual(values = c("gray30", "mediumpurple"))
which gives me this plot:
My problem is that the "spread" of the points is not as wide as the bar. I have tried many combinations, such as using geom_point and position_jitterdodge, or setting the jitter width to 0 or a negative value, but none gave the result I wanted.
Thank you very much for your help!
You have to add a group mapping to the aesthetics of geom_jitter. On your data, I suppose you would have to add group = Flowers.
Here is an example with mtcars:
library(magrittr)
library(dplyr)
library(ggplot2)
library(forcats)
mt1 <- mtcars %>%
group_by(cyl, am) %>%
summarize(mean = mean(hp, na.rm = T),
sd = sd(hp, na.rm = T))
mt2 <- mtcars %>%
mutate(rep = paste0("rep", rep(1:8, each = 4)))
# the plot
ggplot(mt1, aes(x = as_factor(cyl),
y = mean,
fill = as_factor(am))) +
geom_bar(stat = "identity", color = "black",
position = position_dodge(.9)) +
geom_errorbar(aes(ymin = mean, ymax = mean + sd), width = .2,
position = position_dodge(.9)) +
geom_jitter(data = mt2,
aes(x = as_factor(cyl),
y = hp,
fill = as_factor(am),
shape = as_factor(rep),
color = as_factor(am),
group = as_factor(am)),
size = 1,
position = position_jitterdodge(jitter.width = 0.9)) +
scale_shape_manual(values = c(19,17,18,15,7,8,6,5,4,2,13,12,3))+
scale_fill_manual(values = c("gray", "lavender")) +
scale_color_manual(values = c("red", "blue"))
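Applied to data shaped like the table in the question, the same idea might look like the sketch below. The data is made up: the second flower category ("Rose"), the four exposition levels and the values are placeholders, since the question only shows six rows. The jitter.width, dodge.width and seed values are likewise only starting points to tune.

```r
library(ggplot2)

# Made-up data with the same columns as the question's table
set.seed(1)
data <- data.frame(
  Flowers = rep(c("Tulipe", "Rose"), each = 12),
  Sun_Exposition = rep(rep(c("low", "mid", "high", "full"), each = 3), times = 2),
  Value = runif(24, 80, 95)
)

# Mean per Flowers x Sun_Exposition combination for the bars
dataSum <- aggregate(Value ~ Flowers + Sun_Exposition, data = data, FUN = mean)

p <- ggplot(dataSum, aes(Sun_Exposition, Value, fill = Flowers)) +
  geom_col(position = position_dodge(0.9), colour = "black") +
  # group = Flowers makes the points dodge with the bars they belong to;
  # dodge.width matches the bars, jitter.width sets the spread of the points
  geom_point(data = data,
             aes(colour = Flowers, group = Flowers),
             position = position_jitterdodge(jitter.width = 0.3,
                                             dodge.width = 0.9,
                                             seed = 1))
```

Keeping dodge.width equal to the width passed to position_dodge() for the bars is what centres each cloud of points on its bar.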

Connecting points of a shifted plot with another plot using ggplot in R

I have two dataframes df1 and df2 as follows:
> df1
dateTime value
1 1 6
2 2 2
3 3 3
4 4 1
> df2
dateTime value
1 1 3
2 2 8
3 3 4
4 4 5
I want to plot these dataframes in just one diagram, split them to two different plots with same x axis, shift df1 by 1 to the right, and connect each value of df1 to the corresponding value of df2. Here is my code:
#Shift df1 by 1 to the right
df1$value <- lag(df1$value, 1)
plot1 <- df1 %>%
select(dateTime, value) %>%
ggplot(aes(dateTime, value)) +
geom_point() +
geom_line(color = "green") +
geom_segment(aes(xend = dateTime, yend = -Inf), linetype = "dashed") +
theme(axis.text=element_text(size = 14), axis.title=element_text(size = 14),
axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank())
plot2 <- df2 %>%
select(dateTime, value) %>%
ggplot(aes(dateTime, value)) +
geom_point() +
geom_line(color = "red") +
geom_segment(aes(xend = dateTime, yend = Inf), linetype = "dashed") +
xlab("dateTime") +
theme(axis.text=element_text(size = 14), axis.title=element_text(size = 14))
gt <- rbind(ggplotGrob(plot1), ggplotGrob(plot2), size = "last")
# Panel positioning
is_panel <- which(gt$layout$name == "panel")
panel_x <- unique(gt$layout$l[is_panel])
panel_y <- gt$layout$t[is_panel]
# Coordinates and graphical parameters for segments
x_coords <- gt$grobs[[is_panel[1]]]$children[[5]]$x0
gpar <- gt$grobs[[is_panel[1]]]$children[[5]]$gp
linkgrob <- segmentsGrob(x0 = x_coords, y0 = 0, x1 = x_coords, y1 = 1, gp = gpar)
gt <- gtable_add_grob(gt, linkgrob,
t = panel_y[1] + 1, l = panel_x, b = panel_y[2] - 1)
grid.newpage()
grid.draw(gt)
Here is the result, but there is an additional line that I want to remove, and the point for the last value of df1 is missing, which I also want to show:
Lag
I guess that lag is maybe the wrong function:
lag(1:3)
# [1] NA 1 2
If I understand you correctly, you want to shift your data and this depends on your real data, but for this dummy example something like
df1 <- df1 %>%
mutate(dateTime = dateTime + 1)
should do the trick.
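The difference between the two approaches can be checked on the dummy data directly. This is a small sketch assuming dplyr is loaded; the names lagged and shifted are my own.

```r
library(dplyr)

# df1 from the question
df1 <- data.frame(dateTime = 1:4, value = c(6, 2, 3, 1))

# lag() moves the values down one row: the first becomes NA, the last (1) is lost
lagged <- df1 %>% mutate(value = lag(value, 1))

# Adding 1 to dateTime moves every point one step to the right and keeps all four
shifted <- df1 %>% mutate(dateTime = dateTime + 1)
```

lagged has only three usable points, which explains the missing last point in the plot, while shifted keeps all four values at dateTime 2 to 5.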
Lines
You need to adapt your base plots a bit:
plot1 <- df1 %>%
select(dateTime, value) %>%
## create a temp variable to which we can map the line type to
mutate(lty = ifelse(dateTime == max(dateTime), "none", "dashed")) %>%
ggplot(aes(dateTime, value)) +
geom_point() +
geom_line(color = "green") +
## map the linetype to this variable
geom_segment(aes(xend = dateTime, yend = -Inf, linetype = lty)) +
## use a manual scale to map the variable to dashed and blank linetype
scale_linetype_manual(values = c(dashed = "dashed", none = "blank"),
guide = "none") +
## add xlim to align scales properly in both plots
xlim(c(1, 5)) +
theme(axis.text=element_text(size = 14), axis.title=element_text(size = 14),
axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank())
plot2 <- df2 %>%
select(dateTime, value) %>%
mutate(lty = ifelse(dateTime == min(dateTime), "none", "dashed")) %>%
ggplot(aes(dateTime, value)) +
geom_point() +
geom_line(color = "red") +
geom_segment(aes(xend = dateTime, yend = Inf, linetype = lty)) +
scale_linetype_manual(values = c(dashed = "dashed", none = "blank"),
guide = "none") +
xlab("dateTime") +
xlim(c(1, 5)) +
theme(axis.text=element_text(size = 14), axis.title=element_text(size = 14))
This gives you this plot:
