Combining position_dodge and position_fill in ggplot2 - r

What I would like to do is use both the position = "fill" and the position = "dodge" arguments of geom_bar() at the same time somehow. Using some sample data
set.seed(1234)
df <- data.frame(
Id = rep(1:10, each = 12),
Month = rep(1:12, times = 10),
Value = sample(1:2, 10 * 12, replace = TRUE)
)
I'm able to create the following graph
df.plot <- ggplot(df, aes(x = as.factor(Month), fill = as.factor(Value))) +
geom_bar(position = "fill") +
scale_x_discrete(breaks = 1:12) +
scale_y_continuous(labels = percent) +
labs(x = "Month", y = "Value")
I like the scaling and labeling of this graph but I want to be able to unstack it. However when I do the following
df.plot2 <- ggplot(df, aes(x = as.factor(Month), fill = as.factor(Value))) +
geom_bar(position = "dodge", aes(y = (..count..)/sum(..count..))) +
scale_x_discrete(breaks = 1:12) +
scale_y_continuous(labels = percent) +
labs(x = "Month", y = "Value")
The bars are in the position and scaling that I want, but the y-axis labels represent each bar's percentage of the total count, not of the count within each month.
All in all I want the visuals of the second graph with the labeling of the first graph. Is there a relatively easy way to automate this?

Expanding on my comment:
library(ggplot2)
library(dplyr)
library(tidyr)
library(scales)
df1 <- df %>%
group_by(Month) %>%
summarise(Value1 = sum(Value == 1) / n(),
Value2 = sum(Value == 2) / n()) %>%
gather(key = Group,value = Val,Value1:Value2)
df.plot2 <- ggplot(df1, aes(x = as.factor(Month),
y = Val,
fill = as.factor(Group))) +
geom_bar(position = "dodge",stat = "identity") +
scale_y_continuous(labels = percent_format()) +
scale_x_discrete(breaks = 1:12) +
labs(x = "Month", y = "Value")
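As an alternative sketch of the same idea, the within-month shares could be computed with count() and a grouped mutate(), which avoids hard-coding one column per level of Value. The names month_share and Share are illustrative and not part of the original answer; the sample data are the ones from the question.

```r
library(dplyr)
library(ggplot2)

# Sample data from the question
set.seed(1234)
df <- data.frame(
  Id = rep(1:10, each = 12),
  Month = rep(1:12, times = 10),
  Value = sample(1:2, 10 * 12, replace = TRUE)
)

# Count each Value per Month, then divide by the month's total
month_share <- df %>%
  count(Month, Value, name = "n") %>%
  group_by(Month) %>%
  mutate(Share = n / sum(n)) %>%
  ungroup()

# Dodged bars whose heights are shares within each month
p <- ggplot(month_share,
            aes(x = factor(Month), y = Share, fill = factor(Value))) +
  geom_col(position = "dodge") +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "Month", y = "Share within month", fill = "Value")
p
```

Because the shares are computed before plotting, the bars of each month add up to 100% regardless of how many levels Value has.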

Related

R: How to set breaks of factorial x - axis in ggplot?

How can I set the breaks of the x-axis so that every second factor level is dropped? Also, how can I change the ggplotly hover label from factor(Year) to Year?
data <- data.frame (Year = c("2017","2017","2017","2016","2016","2016","2015","2015","2015","2018" ,"2018" ,"2018"),
condition = c("normal","stress","Nitrogen" ,"normal","stress", "Nitrogen","normal","stress","Nitrogen","normal","stress","Nitrogen"),
value = c(22.221268, 1.598309 ,20.560815 ,17.337966,20.440174 , 9.074674, 11.739466, 1.905651, 32.270223, 14.271606 ,12.375446, 17.470793))
library(tidyverse)
data %>%
group_by(Year) %>%
mutate(value = value / sum(value)) %>%
ggplot(aes(fill=condition, y=value, x=factor(Year))) +
geom_col(position="fill", width = 1, color = "white") +
geom_text(aes(label = scales::percent(value, accuracy = 0.1)),
position = position_fill(vjust = 0.50),
color = "white") +
scale_y_continuous(labels = scales::percent) +
scale_fill_brewer(palette = "Set1")
How to show every other value on a discrete axis is a duplicate of this question. Using my answer from there, we can define the every_nth function. As for the factor(Year) tooltip label, the easiest way to avoid that is to convert the column to factor before plotting, so the aesthetic mapping is simply x = Year.
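library(tidyverse)
library(plotly) # provides ggplotly()
# Returns a function that keeps every n-th element of a vector, starting with the first
every_nth = function(n) {
return(function(x) {x[c(TRUE, rep(FALSE, n - 1))]})
}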
data %>%
group_by(Year) %>%
mutate(
value = value / sum(value),
Year = factor(Year) ## put this in mutate() before plotting
) %>%
ggplot(aes(fill = condition, y = value, x = Year)) +
geom_col(position = "fill", width = 1, color = "white") +
geom_text(aes(label = scales::percent(value, accuracy = 0.1)),
position = position_fill(vjust = 0.50),
color = "white") +
scale_y_continuous(labels = scales::percent) +
scale_x_discrete(breaks = every_nth(2)) +
scale_fill_brewer(palette = "Set1") -> p
ggplotly(p)
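To see what the break function does on its own, here is a small base R sketch. The years vector is just an illustrative input; every_nth is the function from the answer, rewritten without the explicit return().

```r
# Build a function that keeps every n-th element, starting with the first
every_nth <- function(n) {
  function(x) x[c(TRUE, rep(FALSE, n - 1))]
}

years <- c("2015", "2016", "2017", "2018")

every_nth(2)(years)         # "2015" "2017"
every_nth(3)(letters[1:7])  # "a" "d" "g"
```

The logical index c(TRUE, FALSE, ...) is shorter than x and is recycled along it, which is why the same pattern repeats over the whole vector.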

R ggplot: Combine a barplot and a line chart from a long dataset

I have a wide dataset that records the blood glucose values from 10 subjects.
library(dplyr)
library(tidyr) # gather()
library(ggplot2)
df_wide = data.frame(
ID = seq(1, 10),
gender = sample(0:1, 10, replace = T),
glucose_0 = sample(100:125, 10, replace = T),
glucose_60 = sample(180:200, 10, replace = T),
glucose_120 = sample(130:160, 10, replace = T),
glucose_180 = sample(100:125, 10, replace = T)
)
I then transformed it into a long dataset using gather:
df_long = df_wide %>%
gather("glucose_0", "glucose_60", "glucose_120", "glucose_180", key = Time, value = glucose) %>%
arrange(ID)
To show how the glucose values changed from 0 min to 180 min, I then made the following line chart:
df_long %>%
ggplot(aes(x = Time, y = glucose, group = ID)) +
geom_line(aes(linetype = as.factor(gender))) +
geom_point() +
theme_classic() +
scale_x_discrete(limits = c("glucose_0", "glucose_60", "glucose_120", "glucose_180"),
labels = c("0", "60", "120", "180")) +
theme(legend.position = "bottom") +
labs(
x = "Time",
y = "Glucose",
linetype = "Gender"
)
Finally, to show the glucose at each time point, I also made a barplot:
df_long %>%
ggplot(aes(x = Time, y = glucose, fill = as.factor(gender))) +
geom_bar(stat = 'identity', position = position_dodge()) +
theme_classic() +
scale_x_discrete(limits = c("glucose_0", "glucose_60", "glucose_120", "glucose_180"))
My question is: How to combine the line chart and the barplot into one figure that looks like this?
To plot the mean glucose levels as both bars and lines:
df_long %>%
group_by(gender, Time) %>%
mutate(glucose = mean(glucose)) %>%
ggplot(aes(x = Time, y = glucose, fill = as.factor(gender))) +
geom_bar(stat = 'identity', position = position_dodge()) +
geom_line(aes(linetype=as.factor(gender), group=ID)) +
theme_classic() +
scale_x_discrete(limits = c("glucose_0", "glucose_60", "glucose_120", "glucose_180"))
Are you looking for such a solution?
library(tidyverse)
df_wide %>%
pivot_longer(
starts_with("glucose")
) %>%
mutate(gender = fct_inorder(factor(gender))) %>%
arrange(ID) %>%
ggplot(aes(x = name, y = value)) +
geom_col(aes(fill = gender, group=gender), width = 0.5, position = position_dodge())+
stat_summary(aes(group = gender), fun = mean, geom = 'line', size=1, alpha=0.9) +
stat_summary(aes(group = gender), fun = mean, geom = 'point', size=2, alpha=0.9) +
theme_classic() +
scale_x_discrete(limits = c("glucose_0", "glucose_60", "glucose_120", "glucose_180"),
labels = c("0", "60", "120", "180")) +
theme(legend.position = "bottom") +
labs(
x = "Time",
y = "Glucose",
fill = "Gender"
)
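If you prefer to see the numbers that end up in the plot, the means can also be computed explicitly before plotting. This is only a sketch: the seed, the glucose_means name and the mean_glucose column are assumptions added for illustration, and the bars here show the group means rather than the individual values.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(1)  # assumption: a seed, so the random sample data are reproducible
df_wide <- data.frame(
  ID = seq(1, 10),
  gender = sample(0:1, 10, replace = TRUE),
  glucose_0 = sample(100:125, 10, replace = TRUE),
  glucose_60 = sample(180:200, 10, replace = TRUE),
  glucose_120 = sample(130:160, 10, replace = TRUE),
  glucose_180 = sample(100:125, 10, replace = TRUE)
)

time_levels <- c("glucose_0", "glucose_60", "glucose_120", "glucose_180")

# One row per gender and time point, holding the mean glucose value
glucose_means <- df_wide %>%
  pivot_longer(starts_with("glucose"),
               names_to = "Time", values_to = "glucose") %>%
  mutate(Time = factor(Time, levels = time_levels),
         gender = factor(gender)) %>%
  group_by(gender, Time) %>%
  summarise(mean_glucose = mean(glucose), .groups = "drop")

# Dodged bars plus one line per gender, both drawn from the same means
p <- ggplot(glucose_means, aes(x = Time, y = mean_glucose, group = gender)) +
  geom_col(aes(fill = gender), width = 0.5, position = position_dodge()) +
  geom_line(aes(linetype = gender)) +
  geom_point() +
  scale_x_discrete(labels = setNames(c("0", "60", "120", "180"), time_levels)) +
  theme_classic() +
  theme(legend.position = "bottom") +
  labs(x = "Time", y = "Mean glucose", fill = "Gender", linetype = "Gender")
p
```

Making Time a factor with explicit levels fixes the order of the x-axis, so limits = ... is no longer needed in scale_x_discrete().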

Use count in the y-axis but percentages and counts as labels

I am trying to create a bar chart in R from a data frame, which has counts in the y-axis but displays as labels a concatenation of percentages and counts.
My data frame looks as below:
ID Response
1 No
2 Yes
3 No
.. ..
The end result I would like to have would be a chart as the one below
This should get you going:
library(tidyverse)
df %>%
group_by(Response) %>%
summarise(count = n()) %>%
mutate(Label = paste0(count, " - ", round(count / sum(count) * 100, 2), "%")) %>%
ggplot(aes(x = Response, y = count)) +
geom_bar(stat = 'identity', fill = 'lightblue') +
geom_text(aes(label = Label)) +
theme_minimal()
A solution as above can be to create a Label column which you can then pass to geom_text if needed.
A dummy data frame:
df <- data.frame(
ID = c(1:100),
Response = c(rep("Yes", 60), rep("No", 40))
)
I'd try something like the below. It's awesome that you're using summarize and mutate; I guess by habit I sometimes use base functions like table.
library(tidyverse)
resps<-sample(c("yes", "no"), 850, replace=T)
percents<-round(100*table(resps)/length(resps),2)
counts<-as.numeric(table(resps))
plotdat<-data.frame(percents, counts=counts, response=rownames(percents))
plotdat %>% ggplot(aes(response, counts)) +
geom_col()+
geom_text(aes(y=counts+10), label=paste(percents,"% ", counts)) +
labs(y="respondents")+
theme_classic()
This is a helpful solution from another question on SO:
library(ggplot2)
library(scales)
library(dplyr) # for %>%
data.frame(response = sample(c("Yes", "No"), size = 100, replace = T, prob = c(0.4, 0.6))) %>%
ggplot(aes(x = response)) +
geom_bar(aes(y = (..count..)/sum(..count..))) +
geom_text(aes(y = ((..count..)/sum(..count..)),
label = scales::percent((..count..)/sum(..count..))), stat = "count", vjust = -0.25) +
scale_y_continuous(labels = percent) +
labs(title = "Proportion of Responses", y = "Percent", x = "Response")
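In recent ggplot2 releases the ..count.. notation is deprecated in favour of after_stat(). A sketch of the same plot in that style follows, assuming ggplot2 3.3.0 or later; the seed and the responses data frame are added here only to make the example reproducible.

```r
library(ggplot2)

set.seed(42)  # assumption: a seed, so the sampled responses are reproducible
responses <- data.frame(
  response = sample(c("Yes", "No"), size = 100, replace = TRUE,
                    prob = c(0.4, 0.6))
)

# after_stat() refers to variables computed by the stat, here `count`
p <- ggplot(responses, aes(x = response)) +
  geom_bar(aes(y = after_stat(count / sum(count)))) +
  geom_text(aes(y = after_stat(count / sum(count)),
                label = after_stat(scales::percent(count / sum(count)))),
            stat = "count", vjust = -0.25) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Proportion of Responses", y = "Percent", x = "Response")
p
```

Note that sum(count) is taken over all bars in the panel, so these are proportions of the total, not proportions within a group.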

Combined geom_bar and geom_point legend in ggplotly

I am trying to get a combined bar + point chart with a legend for both the bars (different Indicators) and the points (a change in the Indicator). I tried to follow along with "ggplot2 legend for plot combining geom_bar and geom_point" and introduced a shape into my geom_point (without doing that I could not get a legend for points).
library(ggplot2)
library(dplyr)
library(ggthemes)
library(plotly)
set.seed(369)
obs <- 6
values1 <- c(round(100 + rnorm(obs) * 10, 2))
values2 <- c(round(100 + rnorm(obs) * 10, 2))
df <- data.frame(Year = rep(2014:2019, 2*2),
value = c(rep(values1, 2), rep(values2, 2)),
Indicator = rep(c("Indicator1", "Indicator2"), each = obs * 2),
Type = rep(c("Bar", "Point"), each = obs))
p <- ggplot(df, aes(value))
bars <- df %>%
filter(Type == "Bar")
points <- df %>%
filter(Type == "Point")
pl <- p +
geom_bar(data = bars,
aes(fill = Indicator, group = Indicator, x = Year, y = value), stat = "identity", position = "dodge") +
geom_point(data = points, aes(x = Year, y = value, group = Indicator, fill = Indicator, shape = "Change"), position = position_dodge(width = 0.9)) +
theme_tufte()
pl
ggplotly(pl, tooltip = c("value"))
ggplotly has the output I want, however the legend has a strange grouping. Is there a way to fix the legend in the chart below?
There's probably a better way, but how's this:
library(tidyverse)
obs <- 6
values1 <- c(round(100 + rnorm(obs) * 10, 2))
values2 <- c(round(100 + rnorm(obs) * 10, 2))
df <- data.frame(Year = rep(2014:2019, 2*2),
value = c(rep(values1, 2), rep(values2, 2)),
Indicator = rep(c("Indicator1", "Indicator2"), each = obs * 2),
Type = rep(c("Bar", "Point"), each = obs))
bars <- df %>% filter(Type == "Bar")
points <- df %>% filter(Type == "Point") %>% mutate(Year =
ifelse(Indicator == "Indicator1", Year - 0.25, Year + 0.25))
p <- ggplot(bars, aes(fill = Indicator, group = Indicator, x = Year, y = value)) +
geom_bar(stat = "identity", position = "dodge", width = 1)
p <- p + geom_point(data = points, mapping = aes(fill = Indicator, x =
Year, y = value), shape = 21) + labs(x = "value") + labs(y = "value")
p
I don't know ggplotly(), but building separate geom_bar() and geom_point() plots, using get_legend() to extract each legend, and then reassembling the legends next to the full plot with plot_grid() seems a decent option.
library(tidyverse)
library(cowplot) # get_legend(), plot_grid()
obs <- 6
values1 <- c(round(100 + rnorm(obs) * 10, 2))
values2 <- c(round(100 + rnorm(obs) * 10, 2))
df <- data.frame(Year = rep(2014:2019, 2*2),
value = c(rep(values1, 2), rep(values2, 2)),
Indicator = rep(c("Indicator1", "Indicator2"), each = obs * 2),
Type = rep(c("Bar", "Point"), each = obs))
bars <- df %>% filter(Type == "Bar")
points <- df %>% filter(Type == "Point") %>% mutate(Year =
ifelse(Indicator == "Indicator1", Year - 0.25, Year + 0.25),
IndicatorChange = Indicator)
p1 <- ggplot(points, mapping = aes(fill = IndicatorChange, x = Year, y = value )) + labs(x = "value") + labs(y = "value") +
geom_point(shape = 21)
p1_leg <- get_legend(p1)
p2 <- ggplot(bars, aes(fill = Indicator, group = Indicator, x = Year, y = value)) +
geom_bar(stat = "identity", position = "dodge")
p2_leg <- get_legend(p2)
p_leg <- plot_grid(p1_leg, p2_leg, ncol = 1, nrow = 5) #toggle nrow to get right spacing between legends
p3 <-ggplot(bars, aes(fill = Indicator, group = Indicator, x = Year, y = value)) + geom_bar(stat = "identity", position = "dodge", width = 1)
p3 <- p3 + geom_point(data = points, mapping = aes(fill = Indicator, x = Year, y = value), shape = 21) +
labs(x = "value") + labs(y = "value")
p3 <- p3 + theme(legend.position="none")
p3
p <- plot_grid(p3, p_leg, ncol =2, nrow =2) #more toggling possible
p
I don't know whether this is what you want (although the font size of the legend should be modified):
library(ggplot2)
library(dplyr)
library(ggthemes)
library(plotly)
set.seed(369)
obs <- 6
values1 <- c(round(100 + rnorm(obs) * 10, 2))
values2 <- c(round(100 + rnorm(obs) * 10, 2))
df <- data.frame(Year = rep(2014:2019, 2*2),
value = c(rep(values1, 2), rep(values2, 2)),
Indicator = rep(c("Indicator1", "Indicator2"), each = obs * 2),
Type = rep(c("Bar", "Point"), each = obs))
p <- ggplot(df, aes(value))
bars <- df %>%
filter(Type == "Bar")
points <- df %>%
filter(Type == "Point")
points$Type1=paste(points$Indicator,"change",sep=",")
pl <- p +
geom_bar(data = bars,
aes(fill = Indicator, group = Indicator, x = Year, y = value), stat = "identity", position = "dodge") +
geom_point(data = points,
aes(x = Year, y = value, group = Indicator, fill = Indicator, shape = "Change"),
position = position_dodge(width = 0.9)) +
theme_tufte()+
theme(legend.position="bottom")
pl <- p +
geom_bar(data = bars,
aes(fill = Indicator, group = Indicator,x = Year, y = value), stat = "identity", position = "dodge") +
geom_point(data = points,
aes(x = Year, y = value,shape = Type1),
position = position_dodge(width = 0.9)) +
theme_tufte()+
theme(legend.position="bottom",
legend.title=element_blank())
pl

ggplot2: show relative % in a stacked barplot per group

I'm trying to plot a basic stacked bar chart per group.
As the values are pretty big, I want to show, for each bar (i.e. each year), the % of each group within the bar.
I managed to show the percentage of the total, but that is not what I want: within each bar, the percentages should sum to 100%.
Is there an easy way to do it without changing the data frame?
(DF <- data.frame( year = rep(2015:2017, each = 4),
Grp = c("Grp1", "Grp2", "Grp3", "Grp4"),
Value = trunc(rnorm(12, 2000000, 100000))) )
ggplot(DF) +
geom_bar(aes(x = year, y = Value, fill = Grp),
stat = "identity",
position = position_stack()) +
geom_text(aes(x = year, y = Value, group = Grp,
label = percent(Value/sum(Value))) ,
position = position_stack(vjust = .5))
You can create a new variable for the percentage by year:
library(dplyr)
library(ggplot2)
library(scales)
DF <- DF %>% group_by(year) %>% mutate(ValuePer=(Value/sum(Value))) %>% ungroup()
ggplot(DF, aes(year, ValuePer, fill = Grp)) +
geom_bar(stat = "identity", position = "fill") +
geom_text(aes(label = percent(ValuePer)),
position = position_fill())+
scale_y_continuous(labels = percent_format())
Use position = "fill" to turn scale into proportions and scale_y_continuous(labels = percent_format()) to turn this scale into percent.
DF <- data.frame( year = rep(2015:2017, each = 4),
Grp = c("Grp1", "Grp2", "Grp3", "Grp4"),
Value = trunc(rnorm(12, 2000000, 100000)))
library(ggplot2)
library(scales)
ggplot(DF, aes(year, Value, fill = Grp)) +
geom_bar(stat = "identity", position = "fill") +
geom_text(aes(label = percent(Value / sum(Value))),
position = position_fill()) +
scale_y_continuous(labels = percent_format())
OK, gathering all your tricks, I finally get this.
I need to adjust my DF, which I wanted to avoid, but it remains simple, so it works:
library(dplyr)
library(ggplot2)
library(scales)
DF <- DF %>% group_by(year) %>% mutate(ValuePer=(Value/sum(Value))) %>% ungroup()
ggplot(DF, aes(year, Value, fill = Grp)) +
geom_bar(stat = "identity", position = "stack") +
geom_text(aes(label = percent(ValuePer)),
position = position_stack()) +
scale_y_continuous(labels = unit_format(unit = "M", scale = 1e-6))
I would use a separate geom_text for each bar, filtering the data by year (bar) with dplyr. Check whether this is what you need:
(DF <- data.frame( year = rep(2015:2017, each = 4),
Grp = c("Grp1", "Grp2", "Grp3", "Grp4"),
Value = trunc(rnorm(12, 2000000, 100000))) )
library(dplyr)
ggplot(DF) +
geom_bar(aes(x = year, y = Value, fill = Grp),
stat = "identity",
position = position_stack()) +
geom_text(data = DF %>% filter(year == 2015),
aes(x = year, y = Value,
label = scales::percent(Value/sum(Value))) ,
position = position_stack(vjust = .5)) +
geom_text(data = DF %>% filter(year == 2016),
aes(x = year, y = Value,
label = scales::percent(Value/sum(Value))) ,
position = position_stack(vjust = .5)) +
geom_text(data = DF %>% filter(year == 2017),
aes(x = year, y = Value,
label = scales::percent(Value/sum(Value))) ,
position = position_stack(vjust = .5))
The group argument is not necessary here. There may be more elegant solutions, but this is the one I could think of. Tell me if this is the output you were waiting for:
Maybe create a new column that does the right computation. I could not figure out how to do the computation inside aes(); the way you did it computes the overall %, whereas Value should be grouped by year.
At least you get the actual values on the y-axis and the per-year % inside the bars. I would advise changing the axis labels with something like this:
scale_y_continuous(breaks = seq(0,8*10^6,10^6),
labels = c(0, paste(seq(1,8,1),'M')))
Resulting in this:
You can adapt to your context.
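One possible way to do the per-year computation inside aes() without adding a column is base R's ave(), which returns a vector of group-wise results aligned with the input. This is a sketch rather than something proposed in the answers above; the seed and the year_share helper are assumptions added for illustration.

```r
library(ggplot2)
library(scales)

set.seed(1)  # assumption: a seed, so the random values are reproducible
DF <- data.frame(
  year = rep(2015:2017, each = 4),
  Grp = rep(c("Grp1", "Grp2", "Grp3", "Grp4"), times = 3),
  Value = trunc(rnorm(12, 2000000, 100000))
)

# ave(Value, year, FUN = sum) repeats each year's total on every row of that
# year, so the ratio is the share within the year (helper used for checking)
year_share <- with(DF, Value / ave(Value, year, FUN = sum))

p <- ggplot(DF, aes(x = year, y = Value, fill = Grp)) +
  geom_col(position = "stack") +
  geom_text(aes(label = percent(Value / ave(Value, year, FUN = sum),
                                accuracy = 0.1)),
            position = position_stack(vjust = 0.5)) +
  scale_y_continuous(labels = unit_format(unit = "M", scale = 1e-6))
p
```

The y-axis keeps the actual values, while the labels inside each bar sum to 100% per year, and DF itself is left unchanged.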
