I want to plot customized Horizontal dots using my data and the code given here
data:
df <- data.frame (origin = c("A","B","C","D","E","F","G","H","I","J"),
Percentage = c(23,16,32,71,3,60,15,21,44,60),
rate = c(10,12,20,200,-25,12,13,90,-105,23),
change = c(10,12,-5,12,6,8,0.5,-2,5,-2))
origin Percentage rate change
1 A 23 10 10.0
2 B 16 12 12.0
3 C 32 20 -5.0
4 D 71 200 12.0
5 E 3 -25 6.0
6 F 60 12 8.0
7 G 15 13 0.5
8 H 21 90 -2.0
9 I 44 -105 5.0
10 J 60 23 -2.0
The observations from the 'origin' column need to go on the y-axis. The corresponding values in the 'change' and 'rate' columns must be shown in boxes instead of circles and distinguished by color, for example 'change' values in lightblue and 'rate' values in blue. In addition, I want to add a second vertical axis on the right and place circles on it whose size is defined by the corresponding value in the 'Percentage' column.
Output of code from the link:
Expected outcome (something like this):
Try this.
First, reshaping so that both rate and change are in one column better supports ggplot's general preference towards "long" data.
df2 <- reshape2::melt(df, id.vars = c("origin", "Percentage"))
(That can also be done using tidyr::pivot_longer.)
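As a sketch of the tidyr route, assuming the tidyr package is installed, pivot_longer() gives the same long layout as the melt() call above. The names_to and values_to arguments are chosen here so the columns match the variable and value names that melt() generates. One difference to be aware of: variable comes back as character rather than a factor.

```r
library(tidyr)

# Same data as in the question
df <- data.frame(
  origin     = c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J"),
  Percentage = c(23, 16, 32, 71, 3, 60, 15, 21, 44, 60),
  rate       = c(10, 12, 20, 200, -25, 12, 13, 90, -105, 23),
  change     = c(10, 12, -5, 12, 6, 8, 0.5, -2, 5, -2)
)

# Stack rate and change into one value column, keyed by variable
df2 <- pivot_longer(df, cols = c(rate, change),
                    names_to = "variable", values_to = "value")
```

The rows come out grouped by origin rather than by variable, which should not matter for the plot.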
The plot:
library(ggplot2)
ggplot(df2, aes(value, origin)) +
geom_label(aes(label = value, fill = variable, color = variable)) +
geom_point(aes(size = Percentage), x = max(df2$value) + 20, shape = 21) +
scale_x_continuous(expand = expansion(add = c(15, 25))) +
scale_fill_manual(values = c(change="lightblue", rate="blue")) +
scale_color_manual(values = c(change="black", rate="white")) +
theme_bw() +
theme(panel.border = element_blank(), panel.grid.major.x = element_blank(), panel.grid.minor.x = element_blank()) +
labs(x = NULL, y = NULL)
The legend and labels can be adjusted in the usual ggplot methods. Overlapping of labels is an issue with which you will need to contend.
Update on OP request: See comments:
gg_dot +
geom_text(aes(x = rate, y = origin,
label = paste0(round(rate, 1), "%")),
col = "black") +
geom_text(aes(x = change, y = origin,
label = paste0(round(change, 1), "%")),
col = "white") +
geom_text(aes(x = x, y = y, label = label, col = label),
data.frame(x = c(40 - 1.1, 180 + 0.6), y = 11,
label = c("change", "rate")), size = 6) +
scale_color_manual(values = c("#9DBEBB", "#468189"), guide = "none") +
scale_y_discrete(expand = c(0.2, 0))
First answer:
Something like this?
library(tidyverse)
library(dslabs)
gg_dot <- df %>%
arrange(rate) %>%
mutate(origin = fct_inorder(origin)) %>%
ggplot() +
# remove axes and superfluous grids
theme_classic() +
theme(axis.title = element_blank(),
axis.ticks.y = element_blank(),
axis.line = element_blank()) +
# add a dummy point for scaling purposes
geom_point(aes(x = 12, y = origin),
size = 0, col = "white") +
# add a horizontal guide line for each origin
geom_hline(yintercept = 1:10, col = "grey80") +
# add a point for each rate value
geom_point(aes(x = rate, y = origin),
size = 11, col = "#9DBEBB") +
# add a point for each change value
geom_point(aes(x = change, y = origin),
size = 11, col = "#468189")
gg_dot +
geom_text(aes(x = rate, y = origin,
label = paste0(round(rate, 1))),
col = "black") +
geom_text(aes(x = change, y = origin,
label = paste0(round(change, 1))),
col = "white") +
geom_text(aes(x = x, y = y, label = label, col = label),
data.frame(x = c(40 - 1.1, 180 + 0.6), y = 11,
label = c("change", "rate")), size = 6) +
scale_color_manual(values = c("#9DBEBB", "#468189"), guide = "none") +
scale_y_discrete(expand = c(0.2, 0))
I'm new to R and I need your help. I need to remove point number 8 (x = 180) from the lines drawn by geom_line for multiple series, while keeping it in geom_point. What should I do?
The data is in an Excel spreadsheet.
data<-melt(CB_fechado, id.vars = 'a');
# Wind incidence angle
#print(data)
Grafico_CB_fechado <- ggplot(data,aes(x =`Ângulo de incidência de vento [°]`, y=`value`, color=`variable`))+
geom_line() + geom_point()+
scale_x_continuous(limits = c(0,180), breaks = c(0,15,30,45,60,75,90,105,120,135,150,165,180))+
scale_y_continuous(limits = c(-1.5,1.5))+
ylab("b")+theme(legend.position = "bottom")+
theme(legend.title = element_blank())
For example:
This is about subsetting the data that you use for geom_line(). Note that this would be a bit more complex if it were not the last point. Here is an example with similar dummy data, since I did not want to type in the values from the image.
dummy data
data <- data.frame(angle = rep(c(0:6*15, 180), 4),
cat = rep(LETTERS[1:4], each = 8),
value = rep(1:4/-4, each = 8))
subset data
Drop points with angle = 180.
data_lines <- data[data$angle != 180,]
graph it
Use data_lines instead of data in geom_line().
library(ggplot2)
ggplot(data, aes(x = angle, y = value, color= cat)) +
geom_line(data = data_lines) +
geom_point()+
scale_x_continuous(limits = c(0,180), breaks = c(0,15,30,45,60,75,90,105,120,135,150,165,180))+
scale_y_continuous(limits = c(-1.5,1.5))+
ylab("b")+theme(legend.position = "bottom")+
theme(legend.title = element_blank())
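The remark above that a non-final point would be more complex can be sketched as follows. Dropping a middle row from the line data would simply connect its two neighbours, so one option (an assumption about what is wanted, not part of the original answer) is to keep the row and set its value to NA in the copy used by geom_line(); ggplot2 breaks a line at a missing value. The data frame d and the angle 45 are made up for illustration.

```r
library(ggplot2)

# Hypothetical single series with angles 0..90 in steps of 15
d <- data.frame(angle = seq(0, 90, by = 15),
                value = c(0.6, 0.5, 0.3, 0.1, 0, -0.2, -0.4))

# Copy used only by geom_line(): blank out the value at angle 45
d_line <- d
d_line$value[d_line$angle == 45] <- NA

p <- ggplot(d, aes(x = angle, y = value)) +
  geom_line(data = d_line) +  # the NA in the middle splits the line in two
  geom_point()                # all seven points are still drawn
```

With several series, the same assignment works as long as the logical index picks the right rows in every series.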
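The trick is to pass a subset of the data to geom_line(). In this case I use subset() to remove the point with a = 180.
Note that I redefine the color aesthetic as the constant "red". Because it sits inside aes(), it is mapped as a single category rather than used as a literal color.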
library(ggplot2)
ggplot(data, aes(x = a, y = b, color = "red")) +
geom_point()+
geom_line(data = subset(data, a != 180)) +
scale_x_continuous(limits = c(0,180), breaks = c(0,15,30,45,60,75,90,105,120,135,150,165,180))+
scale_y_continuous(limits = c(-1.5,1.5))+
ylab("b") +
theme(legend.position = "bottom",
legend.title = element_blank())
Data.
data <- read.table(text = "
a b
1 0 0.57395085
2 15 0.47593420
3 30 0.30175686
4 45 0.13363012
5 60 -0.02727459
6 75 -0.17971621
7 90 -0.44955122
8 180 -0.30247414
", header = TRUE)
I am having this strange error regarding displaying the actual bars in a geom_col() plot.
Suppose I have a data set (called user_data) that contains a count of the total number of changes ('adjustments') done for a particular user (and a plethora of other columns). Let's say it looks like this:
User_ID total_adjustments additional column_1 additional column_2 ...
1 'Blah_17' 21 random_data random_data
2 'Blah_1' 47 random_data random_data
3 'foobar' 2 random_data random_data
4 'acbd1' 17 random_data random_data
5 'user27' 9 random_data random_data
I am using the following code to reduce it into a dataframe with only the two columns I care about:
total_adj_count = user_data %>%
select(User_ID, total_adjustments) %>%
arrange(desc(total_adjustments)) %>%
mutate(User_ID = factor(User_ID, User_ID))
This results in my dataframe (total_adj_count) looking like so:
User_ID total_adjustments
1 'Blah_1' 47
2 'Blah_17' 21
3 'acbd1' 17
4 'user27' 9
5 'foobar' 2
Moving along, here is the code I used to attempt to create a geom_col() plot of that data:
g = ggplot(data=total_adj_count, aes(x = User_ID, y = total_adjustments)) +
geom_bar(width=.5, alpha=1, show.legend = FALSE, fill="#000066", stat="identity") +
labs(x="", y="Adjustment Count", caption="(based on sample data)") +
theme_few(base_size = 10) + scale_color_few() +
theme(axis.text.x=element_text(angle = 45, hjust = 1)) +
geom_text(aes(label=round(total_adjustments, digits = 2)), size=3, nudge_y = 2000) +
theme(
axis.text.y = element_blank(),
axis.ticks.y = element_blank())
p = ggplotly(g)
p = p %>%
layout(margin = m,
showlegend = FALSE,
title = "Number of Adjustments per User"
)
p
And for some strange reason when I try to view plot p it displays all parts of the plot as intended, but does not show the actual bars (or columns).
In fact I get this strange plot and am sort of stuck where to fix it:
Change the nudge_y argument to a smaller number. Right now it is set to 2000, which offsets the labels by 2000 units on the y-axis; with bars of at most 47, that stretches the axis so far that the bars are too short to see. Below I've changed it to nudge_y = 2 and it looks like so:
g <-
ggplot(total_adj_count, aes(User_ID, total_adjustments)) +
geom_col(width = .5, alpha = 1, show.legend = FALSE, fill = "#000066") +
labs(x = "", y = "Adjustment Count", caption = "(based on sample data)") +
theme_few(base_size = 10) +
scale_color_few() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
geom_text(aes(label = round(total_adjustments, digits = 2)), size = 3, nudge_y = 2) +
theme(
axis.text.y = element_blank(),
axis.ticks.y = element_blank()
)
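Rather than hard-coding the offset, one could derive it from the data so the labels stay close to the bars whatever the scale is. This is only a sketch: the 4% factor and the name label_nudge are arbitrary choices for illustration, and the theme and plotly steps are left out.

```r
library(ggplot2)

total_adj_count <- data.frame(
  User_ID = c("Blah_1", "Blah_17", "acbd1", "user27", "foobar"),
  total_adjustments = c(47, 21, 17, 9, 2)
)
# Keep the bars in descending order of total_adjustments
total_adj_count$User_ID <- factor(total_adj_count$User_ID,
                                  levels = total_adj_count$User_ID)

# Hypothetical rule of thumb: offset labels by 4% of the tallest bar
label_nudge <- 0.04 * max(total_adj_count$total_adjustments)

g <- ggplot(total_adj_count, aes(User_ID, total_adjustments)) +
  geom_col(width = 0.5, fill = "#000066") +
  geom_text(aes(label = total_adjustments), size = 3, nudge_y = label_nudge)
```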
Full copy/paste:
library(ggplot2)
library(ggthemes)
library(plotly)
library(dplyr)
text <- " User_ID total_adjustments
1 'Blah_1' 47
2 'Blah_17' 21
3 'acbd1' 17
4 'user27' 9
5 'foobar' 2"
total_adj_count <- read.table(text = text, header = TRUE, stringsAsFactors = FALSE)
g <-
ggplot(total_adj_count, aes(User_ID, total_adjustments)) +
geom_col(width = .5, alpha = 1, show.legend = FALSE, fill = "#000066") +
labs(x = NULL, y = "Adjustment Count", caption = "(based on sample data)", title = "Number of Adjustments per User") +
theme_few(base_size = 10) +
scale_color_few() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
geom_text(aes(label = round(total_adjustments, digits = 2)), size = 3, nudge_y = 2) +
theme(
axis.text.y = element_blank(),
axis.ticks.y = element_blank()
)
p <- ggplotly(g)
p <- layout(p, showlegend = FALSE)
p
I have several stacked column charts representing drilling profiles. I want to offset the y-position of each Borehole to represent the actual height on the ground.
My Data looks like this:
x layer.thickness layer.depth Petrography BSCategory Offset
0 0.2 0.2 silt Drilling1 0
0 1.0 1.2 gravel Drilling1 0
0 3.0 4.2 silt Drilling1 0
4 0.4 0.4 silt Drilling2 -1
4 0.8 1.2 gravel Drilling2 -1
4 2.0 3.2 sand Drilling2 -1
My minimum working code so far is this:
df <- data.frame(x=c(0,0,0,4,4,4), layer.thickness = c(0.2,1.0,3.0,0.4,0.8,2.0),
layer.depth = c(0.2,1.2,4.2,0.4,1.2,3.2),
Petrography = c("silt", "gravel", "silt", "silt", "gravel", "sand"),
BSCategory = c("Drilling1","Drilling1","Drilling1","Drilling2","Drilling2","Drilling2"),
Offset = c(0,0,0,-1,-1,-1))
# provide a numeric ID that stops grouping individual petrography items
df <- transform(df,ix=as.numeric(factor(df$BSCategory)));
drilling <- ggplot(data = df, aes(x = x, y = layer.thickness, group = ix, fill = Petrography)) +
theme_minimal() +
theme(axis.line = element_line(colour = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
panel.background = element_blank(),
axis.line.x = element_blank(),
axis.ticks.y = element_line(),
aspect.ratio=1) +
geom_col(position = position_stack(reverse = TRUE), width= .15,color="black") +
scale_y_reverse(expand = c(0, 0), name ="Depth [m]") +
scale_x_continuous(position = "top", breaks = df$x, labels=paste(df$BSCategory,"\n",df$x,"m"), name="") +
scale_fill_manual(values = c("gravel"='#f3e03a', "sand"= '#e09637', "silt"='#aba77d'))
print(drilling)
This is my output so far (with red indicating what it should look like):
It may be easier to work with geom_rect here (bar charts are/should be anchored at zero). First we need to calculate y start and end positions for each sample:
library(data.table)
setDT(df)[ , `:=`(start = start <- c(Offset[1] + c(0, cumsum(head(layer.thickness, -1)))),
end = start + layer.thickness), by = BSCategory]
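For readers who do not use data.table, the same start and end columns can be sketched in base R. This assumes, as in the sample data, that Offset is constant within each BSCategory and that rows are already ordered top to bottom within a borehole; the helper vector above is a name introduced here.

```r
df <- data.frame(
  layer.thickness = c(0.2, 1.0, 3.0, 0.4, 0.8, 2.0),
  BSCategory = rep(c("Drilling1", "Drilling2"), each = 3),
  Offset = c(0, 0, 0, -1, -1, -1)
)

# Total thickness of the layers lying above each layer, per borehole
above <- ave(df$layer.thickness, df$BSCategory,
             FUN = function(t) cumsum(t) - t)

df$start <- df$Offset + above
df$end   <- df$start + df$layer.thickness
print(df)
```

For Drilling2 this gives layers running from -1 to -0.6, -0.6 to 0.2, and 0.2 to 2.2, i.e. the whole column is shifted by its offset.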
geom_rect has both fill and color aesthetics, which makes it easy to add a border to each individual sample.
w <- 0.5 # adjust to desired width
ggplot(data = df, aes(xmin = x - w / 2, xmax = x + w / 2, ymin = start, ymax = end,
fill = Petrography, group = Petrography)) +
geom_rect(color = "black") +
scale_y_reverse(expand = c(0, 0), name ="Depth [m]") +
scale_x_continuous(position = "top", breaks = df$x, labels = paste(df$BSCategory,"\n", df$x, "m"), name = "") +
scale_fill_manual(values = c("gravel" = '#f3e03a', "sand" = '#e09637', "silt" = '#aba77d')) +
theme_classic() +
theme(axis.line.x = element_blank(),
axis.ticks.x = element_blank())
Alternatively, geom_segment could be used. geom_segment doesn't have a fill aes, so we need to change to color:
ggplot(data = df, aes(x = x, xend = x, y = start, yend = end, color = Petrography, group = ix)) +
geom_segment(size = 5) +
scale_y_reverse(expand = c(0, 0), name ="Depth [m]") +
scale_x_continuous(position = "top", breaks = df$x, labels = paste(df$BSCategory,"\n", df$x, "m"), name = "") +
scale_color_manual(values = c("gravel" = '#f3e03a', "sand" = '#e09637', "silt" = '#aba77d')) +
theme_classic() +
theme(axis.line.x = element_blank(),
axis.ticks.x = element_blank())
To add borders, see Add border to segments in geom_segment.
I'm using ggplot2 to create histograms for two different parameters. My current approach is attached at the end of my question (including a dataset, which can be loaded directly from pastebin.com), and creates
a histogram visualizing the frequency for the spatial distribution of logged user data based on the "location" attribute (either "WITHIN" or "NOT_WITHIN").
a histogram visualizing the frequency for the distribution of logged user data based on the "context" attribute (either "Clicked A" or "Clicked B").
The code looks like the following:
# Load my example dataset from pastebin
RawDataSet <- read.csv("http://pastebin.com/raw/uKybDy03", sep=";")
# Load packages
library(plyr)
library(dplyr)
library(reshape2)
library(ggplot2)
###### Create Frequency Table for Location-Information
LocationFrequency <- ddply(RawDataSet, .(UserEmail), summarize,
All = length(UserEmail),
Within_area = sum(location=="WITHIN"),
Not_within_area = sum(location=="NOT_WITHIN"))
# Create a column for unique identifiers
LocationFrequency <- mutate(LocationFrequency, id = rownames(LocationFrequency))
# Reorder columns
LocationFrequency <- LocationFrequency[,c(5,1:4)]
# Format id-column as numbers (not as string)
LocationFrequency[,c(1)] <- sapply(LocationFrequency[, c(1)], as.numeric)
# Melt data
LocationFrequency.m = melt(LocationFrequency, id.var=c("UserEmail","All","id"))
# Plot data
p <- ggplot(LocationFrequency.m, aes(x=id, y=value, fill=variable)) +
geom_bar(stat="identity") +
theme_grey(base_size = 16)+
labs(title="Histogram showing the distribution of all spatial information per user.") +
labs(x="User", y="Number of notifications interaction within/not within the area") +
# using IDs instead of UserEmail
scale_x_continuous(breaks=c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30), labels=c("1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20","21","22","23","24","25","26","27","28","29","30"))
# Change legend Title
p + labs(fill = "Type of location")
##### Create Frequency Table for Interaction-Information
InterationFrequency <- ddply(RawDataSet, .(UserEmail), summarize,
All = length(UserEmail),
Clicked_A = sum(context=="Clicked A"),
Clicked_B = sum(context=="Clicked B"))
# Create a column for unique identifiers
InterationFrequency <- mutate(InterationFrequency, id = rownames(InterationFrequency))
# Reorder columns
InterationFrequency <- InterationFrequency[,c(5,1:4)]
# Format id-column as numbers (not as string)
InterationFrequency[,c(1)] <- sapply(InterationFrequency[, c(1)], as.numeric)
# Melt data
InterationFrequency.m = melt(InterationFrequency, id.var=c("UserEmail","All","id"))
# Plot data
p <- ggplot(InterationFrequency.m, aes(x=id, y=value, fill=variable)) +
geom_bar(stat="identity") +
theme_grey(base_size = 16)+
labs(title="Histogram showing the distribution of all interaction types per user.") +
labs(x="User", y="Number of interaction") +
# using IDs instead of UserEmail
scale_x_continuous(breaks=c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30), labels=c("1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20","21","22","23","24","25","26","27","28","29","30"))
# Change legend Title
p + labs(fill = "Type of interaction")
But here is what I'm trying to achieve: how can I combine both histograms in a single plot? Would it be possible to show the corresponding percentage for each part? Something like the following sketch, in which the complete height of each bar represents the total number of observations per user and the segments visualize the corresponding data. Each bar would be divided into two parts (within and not_within), and each part would then be divided into two subparts showing the percentage of the interaction types ("Clicked A" or "Clicked B").
With the updated description, I would make a combined barplot with two parts: a negative and a positive one. In order to achieve that, you have to get your data into the correct format:
# load needed libraries
library(dplyr)
library(tidyr)
library(ggplot2)
# summarise your data
new.df <- RawDataSet %>%
group_by(UserEmail,location,context) %>%
tally() %>%
mutate(n2 = n * c(1,-1)[(location=="NOT_WITHIN")+1L]) %>%
group_by(UserEmail,location) %>%
mutate(p = c(1,-1)[(location=="NOT_WITHIN")+1L] * n/sum(n))
The new.df dataframe looks like:
> new.df
Source: local data frame [90 x 6]
Groups: UserEmail, location [54]
UserEmail location context n n2 p
(fctr) (fctr) (fctr) (int) (dbl) (dbl)
1 andre NOT_WITHIN Clicked A 3 -3 -1.0000000
2 bibi NOT_WITHIN Clicked A 4 -4 -0.5000000
3 bibi NOT_WITHIN Clicked B 4 -4 -0.5000000
4 bibi WITHIN Clicked A 9 9 0.6000000
5 bibi WITHIN Clicked B 6 6 0.4000000
6 corinn NOT_WITHIN Clicked A 10 -10 -0.5882353
7 corinn NOT_WITHIN Clicked B 7 -7 -0.4117647
8 corinn WITHIN Clicked A 9 9 0.7500000
9 corinn WITHIN Clicked B 3 3 0.2500000
10 dpfeifer NOT_WITHIN Clicked A 7 -7 -1.0000000
.. ... ... ... ... ... ...
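The sign flip in the mutate() calls above relies on indexing a two-element vector with a logical: FALSE + 1L is 1 and picks the first element (1), TRUE + 1L is 2 and picks the second (-1). A small standalone sketch with made-up values, together with an ifelse() form that should give the same result:

```r
location <- c("WITHIN", "NOT_WITHIN", "WITHIN")
n <- c(9, 4, 6)

# (location == "NOT_WITHIN") is FALSE/TRUE; adding 1L turns it into index 1/2
sign_vec <- c(1, -1)[(location == "NOT_WITHIN") + 1L]
n2 <- n * sign_vec

# Equivalent, arguably more readable, formulation
n2_alt <- ifelse(location == "NOT_WITHIN", -n, n)
print(data.frame(location, n, n2, n2_alt))
```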
Next you can create a plot with:
ggplot() +
geom_bar(data = new.df[new.df$location == "NOT_WITHIN",],
aes(x = UserEmail, y = n2, color = "darkgreen", fill = context),
size = 1, stat = "identity", width = 0.7) +
geom_bar(data = new.df[new.df$location == "WITHIN",],
aes(x = UserEmail, y = n2, color = "darkred", fill = context),
size = 1, stat = "identity", width = 0.7) +
scale_y_continuous(breaks = seq(-20,20,5),
labels = c(20,15,10,5,0,5,10,15,20)) +
scale_color_manual("Location of interaction",
values = c("darkgreen","darkred"),
labels = c("NOT_WITHIN","WITHIN")) +
scale_fill_manual("Type of interaction",
values = c("lightyellow","lightblue"),
labels = c("Clicked A","Clicked B")) +
guides(color = guide_legend(override.aes = list(color = c("darkred","darkgreen"),
fill = NA, size = 2), reverse = TRUE),
fill = guide_legend(override.aes = list(fill = c("lightyellow","lightblue"),
color = "black", size = 0.5))) +
theme_minimal() +
theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5, size = 14),
axis.title = element_blank(),
legend.title = element_text(face = "italic", size = 14),
legend.key.size = unit(1, "lines"),
legend.text = element_text(size = 11))
which results in:
If you want to use percentage values, you can use the p-column to make a plot:
ggplot() +
geom_bar(data = new.df[new.df$location == "NOT_WITHIN",],
aes(x = UserEmail, y = p, color = "darkgreen", fill = context),
size = 1, stat = "identity", width = 0.7) +
geom_bar(data = new.df[new.df$location == "WITHIN",],
aes(x = UserEmail, y = p, color = "darkred", fill = context),
size = 1, stat = "identity", width = 0.7) +
scale_y_continuous(breaks = c(-1,-0.75,-0.5,-0.25,0,0.25,0.5,0.75,1),
labels = scales::percent(c(1,0.75,0.5,0.25,0,0.25,0.5,0.75,1))) +
scale_color_manual("Location of interaction",
values = c("darkgreen","darkred"),
labels = c("NOT_WITHIN","WITHIN")) +
scale_fill_manual("Type of interaction",
values = c("lightyellow","lightblue"),
labels = c("Clicked A","Clicked B")) +
coord_flip() +
guides(color = guide_legend(override.aes = list(color = c("darkred","darkgreen"),
fill = NA, size = 2), reverse = TRUE),
fill = guide_legend(override.aes = list(fill = c("lightyellow","lightblue"),
color = "black", size = 0.5))) +
theme_minimal(base_size = 14) +
theme(axis.title = element_blank(),
legend.title = element_text(face = "italic", size = 14),
legend.key.size = unit(1, "lines"),
legend.text = element_text(size = 11))
which results in:
In response to the comment
If you want to place the text-labels inside the bars, you will have to calculate a position variable too:
new.df <- RawDataSet %>%
group_by(UserEmail,location,context) %>%
tally() %>%
mutate(n2 = n * c(1,-1)[(location=="NOT_WITHIN")+1L]) %>%
group_by(UserEmail,location) %>%
mutate(p = c(1,-1)[(location=="NOT_WITHIN")+1L] * n/sum(n),
pos = (context=="Clicked A")*p/2 + (context=="Clicked B")*(c(1,-1)[(location=="NOT_WITHIN")+1L] * (1 - abs(p)/2)))
Then add the following line to your ggplot code after the geom_bar's:
geom_text(data = new.df, aes(x = UserEmail, y = pos, label = n))
which results in:
Instead of label = n you can also use label = scales::percent(abs(p)) to display the percentages.
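To see what the pos formula above computes, here is a worked sketch using the four rows for user bibi from the printed new.df. The vector sgn is a name introduced here for the repeated sign expression. Under the formula, the "Clicked A" label sits at the middle of a segment running from 0 to p, and the "Clicked B" label at the middle of a segment that ends at 1 (or -1).

```r
# Rows for one user, taken from the printed new.df above
location <- c("NOT_WITHIN", "NOT_WITHIN", "WITHIN", "WITHIN")
context  <- c("Clicked A", "Clicked B", "Clicked A", "Clicked B")
p        <- c(-0.5, -0.5, 0.6, 0.4)

# +1 for WITHIN, -1 for NOT_WITHIN
sgn <- c(1, -1)[(location == "NOT_WITHIN") + 1L]

# Same expression as in the mutate() call, written with sgn
pos <- (context == "Clicked A") * p / 2 +
  (context == "Clicked B") * (sgn * (1 - abs(p) / 2))

print(data.frame(location, context, p, pos))
```

Note that the formula assumes the two shares on each side add up to 1, which holds for the percentage plot but not for the count plot.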
I'm trying to overlay the bars from two geom_bar layers derived from 2 separate data.frames.
dEQ
lab perc
1 lmP 55.9
2 lmN 21.8
3 Nt 0.6
4 expG 5.6
5 expD 0.0
6 prbN 11.2
7 prbP 5.0
and
LMD
lab perc
1 lmP 16.8
2 lmN 8.9
3 Nt 0.0
4 expG 0.0
5 expD 0.0
6 prbN 0.0
7 prbP 0.0
The first plot is:
p <- ggplot(dEQ, aes(lab, perc)) +
xlab(xlabel) + ylab(ylabel) +
geom_bar(stat="identity", colour="blue", fill="darkblue") +
geom_text(aes(vecX, vecYEQ+1.5, label=vecYlbEQ), data=dEQ, size=8.5) +
theme_bw() +
opts(axis.text.x = theme_text(size = 20, face = "bold", colour = "black")) +
opts(axis.text.y = theme_text(size = 20, face = "bold", colour = "black")) +
coord_flip() +
scale_y_continuous(breaks=c(0,10,20,30,40,50,60),
labels=c("0","","20","","40","","60"),
limits = c(0, 64), expand = c(0,0))
print(p)
but I want to overplot with another geom_bar from data.frame LMD
ggplot(LMD, aes(lab, perc)) +
geom_bar(stat="identity", colour="blue", fill="red", add=T)
and I want to have a legend.
here is an example:
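# stat = "identity" is required in current ggplot2 because perc is mapped to y
p <- ggplot(NULL, aes(lab, perc)) +
geom_bar(aes(fill = "dEQ"), data = dEQ, stat = "identity", alpha = 0.5) +
geom_bar(aes(fill = "LMD"), data = LMD, stat = "identity", alpha = 0.5)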
p
but I recommend combining them with rbind and plotting them side by side with dodging:
dEQ$name <- "dEQ"
LMD$name <- "LMD"
d <- rbind(dEQ, LMD)
p <- ggplot(d, aes(lab, perc, fill = name)) + geom_bar(stat = "identity", position = "dodge")
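For a version that can be pasted and run as is, the two data frames can be rebuilt from the tables in the question. This is a sketch: geom_col() is used as the current shorthand for bars whose height is given by y, and the factor step (with the helper labs_order, a name introduced here) only keeps the labels in the order of the original tables.

```r
library(ggplot2)

labs_order <- c("lmP", "lmN", "Nt", "expG", "expD", "prbN", "prbP")
dEQ <- data.frame(lab = labs_order,
                  perc = c(55.9, 21.8, 0.6, 5.6, 0.0, 11.2, 5.0))
LMD <- data.frame(lab = labs_order,
                  perc = c(16.8, 8.9, 0, 0, 0, 0, 0))

# Tag each data set, then stack them into one long data frame
dEQ$name <- "dEQ"
LMD$name <- "LMD"
d <- rbind(dEQ, LMD)
d$lab <- factor(d$lab, levels = labs_order)

# One legend entry per data set, bars placed side by side
p <- ggplot(d, aes(lab, perc, fill = name)) +
  geom_col(position = "dodge") +
  labs(fill = "Data set")
```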
Although this does not directly address the OP's requirement, many later questions on SO have been closed as duplicates pointing to this one, so I am proposing a method for constructing bar(s)-within-bar plots in ggplot2.
Example for two bars (group-wise division) within one bigger bar plot.
library(tidyverse)
set.seed(40)
df <- tibble(name = LETTERS[1:10], provision = rnorm(mean = 100, sd = 20, n = 10),
expenditure = provision - rnorm(mean = 25, sd = 10, n = 10))
df %>% mutate(savings = provision - expenditure) %>%
pivot_longer(cols = c("expenditure", "savings"), names_to = "Exp", values_to = "val") %>%
ggplot() + geom_bar(aes(x= name, y = provision/2), stat = "identity", fill = "blue", width = 0.9, alpha = 0.3) +
geom_col(aes(x=name,y=val, fill = Exp), position ="dodge", width = 0.7) +
scale_y_continuous(name = "Amount in \u20b9")
Another option is to overlay the bars without lowering the opacity through alpha. Group the data by your fill variable with group_by, sort the y variable with arrange(desc()), and use position = position_identity() so the bars are drawn on top of each other, with the highest values behind and the lower values in front. Then you don't need to change the transparency. Here is a reproducible example:
# Add name for fill aesthetic
dEQ$name <- "dEQ"
LMD$name <- "LMD"
library(dplyr)
library(ggplot2)
dEQ %>%
rbind(LMD) %>%
group_by(name) %>%
arrange(desc(perc)) %>%
ggplot(aes(x = lab, y = perc, fill = name)) +
geom_bar(stat="identity", position = position_identity())
Created on 2022-11-02 with reprex v2.0.2
As you can see, the bars overlay while keeping their original opacity.