Automated option to make bars show their values - r

I have a stacked bar plot like this:
library(ggplot2)
Year <- c(rep(c("2006-07", "2007-08", "2008-09", "2009-10"), each = 4))
Category <- c(rep(c("A", "B", "C", "D"), times = 4))
Frequency <- c(1, 1, 8, 32, 216, 431, 319, 368, 423, 645, 234, 685, 166, 467, 274, 251)
Data <- data.frame(Year, Category, Frequency)
ggplot(Data, aes(x = Year, y = Frequency, fill = Category, label = Frequency)) +
geom_bar(stat = "identity") +
geom_text(size = 3, position = position_stack(vjust = 0.5))
Is there an automated option to make the bars show their values clearly, even when a frequency is as small as 1, as in 2006-07?

There's no ideal solution to showing this tidily on a plot. You could use geom_label_repel from ggrepel:
library(ggplot2)
library(ggrepel)
ggplot(Data, aes(x = Year, y = Frequency, fill = Category, label = Frequency)) +
geom_bar(stat = "identity") +
geom_label_repel(size = 3, position = position_stack(vjust = 0.5))
or facet with free scales:
ggplot(Data, aes(x = Year, y = Frequency, fill = Category, label = Frequency)) +
geom_bar(stat = "identity") +
geom_text(size = 3, position = position_stack(vjust = 0.5)) +
facet_wrap(.~Year, drop = TRUE, nrow = 1, scales = "free") +
theme(strip.background = element_blank(), strip.text = element_blank())
Or perhaps a facet_zoom from ggforce:
ggplot(Data, aes(x = Year, y = Frequency, fill = Category, label = Frequency)) +
geom_bar(stat = "identity") +
geom_text(size = 3, position = position_stack(vjust = 0.5)) +
ggforce::facet_zoom(ylim = c(0, 50))
Or have floating labels:
ggplot(Data, aes(x = Year, y = Frequency, fill = Category, label = Frequency)) +
geom_bar(stat = "identity") +
geom_text(data = within(Data, Frequency[Year == "2006-07"] <- NA), size = 3,
position = position_stack(vjust = 0.5)) +
geom_label(data = Data[1:4,], aes(y = 1:4 * 100),
position = "stack")
Personally, I think I'd go with a table here...
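As a minimal sketch of the table option, assuming a plain cross-tabulation is enough, base R's xtabs() can lay out the same Data with categories as rows and years as columns. The name freq_table is only illustrative.

```r
Year <- rep(c("2006-07", "2007-08", "2008-09", "2009-10"), each = 4)
Category <- rep(c("A", "B", "C", "D"), times = 4)
Frequency <- c(1, 1, 8, 32, 216, 431, 319, 368, 423, 645, 234, 685, 166, 467, 274, 251)
Data <- data.frame(Year, Category, Frequency)

# Cross-tabulate: one row per Category, one column per Year
freq_table <- xtabs(Frequency ~ Category + Year, data = Data)
freq_table
```

Small values such as the 1s in 2006-07 stay readable here because nothing depends on bar height.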

Related

ggplot: Order stacked barplots by variable proportion

I am creating a plot with 3 variables as below. Is there a way to arrange the bars in descending order, so that the bar with the highest proportion of variable "c" comes first? In this example the last bar should come first, then the middle one, and the first bar should come last.
library(tidyverse)
long <- data.frame(
Name = c("abc","abc","abc","gif","gif","gif","xyz","xyz","xyz"),
variable = c("a","b","c","a","b","c","c","b","a"),
value = c(4,6,NA,2,8,1,6,NA,NA))
long_totals <- long %>%
group_by(Name) %>%
summarise(Total = sum(value, na.rm = T))
p <- ggplot()+
geom_bar(data = long,
aes(x = Name,
y = value,
fill=variable),
stat="summary",
position = "fill") +
geom_text(data = long_totals,
aes(y = 100,
x = Name,
label = Total),
size = 7,
position = position_fill(vjust = 1.02)) +
scale_y_continuous(labels = scales::percent_format()) +
ylab("Total_num") +
ggtitle("Totalnum") +
theme(plot.title = element_text(size = 20, hjust = 0.5)) +
theme(axis.text.x = element_text(angle = 75, vjust = 0.95, hjust=1))
The following code does arrange the bars by count of "c" but not by proportion. How can I arrange by proportion?
p<-long %>%
mutate(variable = fct_relevel(variable,
c("c", "b", "a"))) %>%
arrange(variable) %>%
mutate(Name = fct_inorder(Name))
p %>%
ggplot() +
aes(x = Name,
y = value,
fill = variable) +
geom_bar(position = "fill",
stat = "summary")
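We could use fct_rev from the forcats package, which is part of the tidyverse. Here it reverses the alphabetical order of Name, which happens to match the requested order: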
p <- ggplot()+
geom_bar(data = long,
aes(x = fct_rev(Name),
y = value,
fill=variable),
stat="summary",
position = "fill") +
geom_text(data = long_totals,
aes(y = 100,
x = Name,
label = Total),
size = 7,
position = position_fill(vjust = 1.02)) +
scale_y_continuous(labels = scales::percent_format()) +
ylab("Total_num") +
ggtitle("Totalnum") +
theme(plot.title = element_text(size = 20, hjust = 0.5)) +
theme(axis.text.x = element_text(angle = 75, vjust = 0.95, hjust=1))
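For data where reversing the order is not enough, a more general sketch is to compute each Name's share of "c" and set the factor levels from it. This assumes missing values should count as zero. The names share_c and prop_c are illustrative and not from the original answer.

```r
library(dplyr)

long <- data.frame(
  Name = c("abc","abc","abc","gif","gif","gif","xyz","xyz","xyz"),
  variable = c("a","b","c","a","b","c","c","b","a"),
  value = c(4,6,NA,2,8,1,6,NA,NA))

# Share of "c" within each Name, sorted from largest to smallest
share_c <- long %>%
  group_by(Name) %>%
  summarise(prop_c = sum(value[variable == "c"], na.rm = TRUE) /
                     sum(value, na.rm = TRUE)) %>%
  arrange(desc(prop_c))

# Fix the bar order by setting the factor levels explicitly
long$Name <- factor(long$Name, levels = share_c$Name)
levels(long$Name)
```

If other layers use their own data frame (such as long_totals), giving its Name column the same levels keeps the layers aligned.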

X axis and right box in stacked bars

In a plot like this
library(ggplot2)
library(dplyr)
df <- data.frame(class = c("a","b","a","b"), date = c(2009,2009,2010,2010), volume=c(1,1,2,0))
df <- df %>% group_by(date) %>% mutate(volumep = 100 * volume/sum(volume))
ggplot(df, aes(x = date, y = volumep, fill = class, label = volumep)) +
geom_bar(stat = "identity") +
geom_text(size = 3, position = position_stack(vjust = 0.5)) + coord_flip()
How can I increase the size of the legend text on the right (class), and how can I make the value axis show the breaks 0, 25, 50 and 100?
To answer the question, just adjust the involved aesthetics, y and size.
ggplot(df, aes(x = date, y = 100*volume, fill = class, label = volume)) +
geom_bar(stat = "identity") +
geom_text(size = c(3, 3, 5, 5), position = position_stack(vjust = 0.5)) +
coord_flip() +
ylab("volume")
Another option is to mutate the values of volume first. In this case, there would be no need to manually set the y axis label.
After the question's edit, the code is now as follows.
library(ggplot2)
library(dplyr)
df %>%
group_by(date) %>%
mutate(volume = 100*volume/sum(volume)) %>%
ggplot(aes(x = date, y = volume, fill = class, label = volume)) +
geom_bar(stat = "identity") +
geom_text(size = c(3, 3, 5, 5), position = position_stack(vjust = 0.5)) +
coord_flip()
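To enlarge the legend text, set legend.title and legend.text in theme(); to control the axis ticks, pass breaks to scale_y_continuous():
ggplot(df, aes(x = date, y = volumep, fill = class, label = volumep)) +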
geom_bar(stat = "identity") +
geom_text(size = 3, position = position_stack(vjust = 0.5)) +
coord_flip() +
theme(legend.title=element_text(size=22),
legend.text=element_text(size=22)) +
scale_y_continuous(breaks=c(0,25, 50, 100))
Edit:
I'd suggest recasting date as a factor:
ggplot(df, aes(x = factor(date), y = volumep, fill = class, label = volumep)) +
geom_bar(stat = "identity") +
geom_text(size = 3, position = position_stack(vjust = 0.5)) +
coord_flip() +
theme(legend.title=element_text(size=22),
legend.text=element_text(size=22)) +
scale_y_continuous(breaks=c(0,25, 50, 100)) +
labs(x = "date")

Why does the value of y increase in geom_bar(stat = "identity") for a stacked bar chart with a factor grouping?

In the following example, I think the height of the bars should be 500 and 45. Why are they both over 1000?
library(tidyverse)
dat <- tibble(
x = c(1, 1, 2, 2, 2),
grp = factor(c(0, 1, 0, 1, 2), levels = 0:2),
y = c(200, 300, 25, 15, 5)
)
ggplot(dat, aes(x = x, y = y, fill = grp)) +
geom_bar(stat = "identity") +
scale_y_log10(labels = scales::comma)
If you use position = "dodge", the y-axis values appear to be correct.
ggplot(dat, aes(x = x, y = y, fill = grp)) +
geom_bar(stat = "identity", position = "dodge") +
scale_y_log10(labels = scales::comma)
Any ideas where ggplot2 is getting its y-axis values in the first plot?
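One likely explanation, offered as a sketch rather than an answer from the thread: ggplot2 applies the scale transformation before stacking, so the bars stack log10(y). Adding logs multiplies the original values, so each bar's top lands at the product of its group values, not their sum. The arithmetic can be checked in base R (stacked_top and plain_sum are illustrative names):

```r
dat <- data.frame(
  x = c(1, 1, 2, 2, 2),
  y = c(200, 300, 25, 15, 5)
)

# Stacking on the log scale sums log10(y) within each x,
# which corresponds to the product of the y values
stacked_top <- 10^tapply(log10(dat$y), dat$x, sum)
stacked_top

# Plain sums per x, which is what the question expected (500 and 45)
plain_sum <- tapply(dat$y, dat$x, sum)
plain_sum
```

The products are 200 * 300 = 60,000 and 25 * 15 * 5 = 1,875, both above 1,000. With position = "dodge" nothing is added together, which is why those heights look right.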

ggplot2 coord_polar preserve order when using fill

Specifying the fill argument for aes results in a reverse order of the pie chart, so the breaks/labels won't match the pie pieces anymore. Please see the example and resulting plots below.
df = data.frame(Var1 = letters[1:5], Var2 = c(6, 31, 34, 66, 77))
df$Var1 = factor(df$Var1, levels = df$Var1, ordered = T)
# just fine, but no colors
ggplot(df, aes(x = 1,
y = Var2)) +
geom_bar(width = 1, stat = "identity") +
coord_polar(theta = "y") +
scale_fill_manual(values = c("red","green","yellow","black","white"),
guide = guide_legend(title = "My_Title")) +
scale_y_continuous(breaks = (cumsum(df$Var2) -
df$Var2 / 2),
labels = df$Var1)
# reverse order appears
ggplot(df, aes(x = 1,
y = Var2,
fill = Var1)) +
geom_bar(width = 1, stat = "identity") +
coord_polar(theta = "y") +
scale_fill_manual(values = c("red","green","yellow","black","white"),
guide = guide_legend(title = "My_Title")) +
scale_y_continuous(breaks = (cumsum(df$Var2) -
df$Var2 / 2),
labels = df$Var1)
As of ggplot2 v2.2.0, bars are stacked in reverse factor order, so reversing the fill factor restores the original order:
ggplot(df, aes(x = 1,
y = Var2,
fill = forcats::fct_rev(Var1))) +
geom_bar(width = 1, stat = "identity", col = 1) +
coord_polar(theta = "y") +
scale_y_continuous(breaks = (cumsum(df$Var2) -
df$Var2 / 2),
labels = df$Var1)
Also, you may use geom_col instead of geom_bar(stat = "identity").
Another option to reverse the order of the stack would be to make use of position_stack(reverse=TRUE):
df = data.frame(Var1 = letters[1:5], Var2 = c(6, 31, 34, 66, 77))
df$Var1 = factor(df$Var1, levels = df$Var1, ordered = T)
library(ggplot2)
# stack in the original factor order
ggplot(df, aes(x = 1,
y = Var2,
fill = Var1)) +
geom_bar(width = 1,
stat = "identity",
position = position_stack(reverse = TRUE)) +
coord_polar(theta = "y") +
scale_fill_manual(values = c("red","green","yellow","black","white"),
guide = guide_legend(title = "My_Title")) +
scale_y_continuous(breaks = (cumsum(df$Var2) - df$Var2 / 2), labels = df$Var1)
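To check that the reversed stack lines up with the label positions, one option is to inspect the computed slice boundaries with ggplot2's layer_data(). This is a minimal sketch using geom_col; the names slices, slice_mid and label_pos are illustrative.

```r
library(ggplot2)

df <- data.frame(Var1 = letters[1:5], Var2 = c(6, 31, 34, 66, 77))
df$Var1 <- factor(df$Var1, levels = df$Var1)

p <- ggplot(df, aes(x = 1, y = Var2, fill = Var1)) +
  geom_col(width = 1, position = position_stack(reverse = TRUE)) +
  coord_polar(theta = "y")

# Computed slice boundaries, one row per slice, sorted by factor level
slices <- layer_data(p)
slices <- slices[order(slices$group), ]
slice_mid <- (slices$ymin + slices$ymax) / 2

# Label positions as used in scale_y_continuous(breaks = ...)
label_pos <- cumsum(df$Var2) - df$Var2 / 2
```

With reverse = TRUE the first level starts at 0, so each slice midpoint coincides with its break.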

Count and Percent Together using Stack Bar in R

I am trying to create a stacked bar chart with counts and percents in the same graph. Following "Showing data values on stacked bar chart in ggplot2", I added group totals and plotted my data using this code:
### to plot stacked bar graph with total on the top and
### distribution of the frequency;
library(ggplot2);
library(plyr);
library(dplyr);
Year <- c(rep(c("2006-07", "2007-08", "2008-09", "2009-10"), each = 4))
Category <- c(rep(c("A", "B", "C", "D"), times = 4))
Frequency <- c(168, 259, 226, 340, 216, 431, 319, 368, 423, 645, 234, 685, 166, 467, 274, 251)
Data <- data.frame(Year, Category, Frequency);
sum_count <-
Data %>%
group_by(Year) %>%
summarise(max_pos = sum(Frequency));
sum_count;
Data <- ddply(Data, .(Year), transform, pos =
cumsum(Frequency) - (0.5 * Frequency));
Data;
# plot bars and add text
p <- ggplot(Data, aes(x = Year, y = Frequency)) +
geom_bar(aes(fill = Category), stat="identity") +
geom_text(aes(label=Frequency,y = pos), size = 3) +
geom_text(data = sum_count,
aes(y = max_pos, label = max_pos), size = 4,
vjust = -0.5);
print(p);
Now I want to overlay the percent of each group alongside the counts. My approach is to merge the data so that the percent can be calculated for each group:
MergeData <- merge(Data,sum_count,by="Year");
# percent of each category within its year (numeric, used for positions)
MergeData <- transform(MergeData,
per_cent=round((Frequency/max_pos)*100,0));
MergeData<- ddply(MergeData, .(Year), transform, per_pos =
cumsum(per_cent) - (0.5 * per_cent));
# calculate percent and attach % sign;
MergeData <- transform(MergeData,
per_cent=paste0(round((Frequency/max_pos)*100,0),"%"));
# Data only with percents
Percent_Data <- subset(MergeData,select
= c("Year","Category","per_cent","per_pos"));
I am wondering whether it is possible to overlay the percent data on the plot created by the previous code, so that counts and percents are shown together.
I think you are almost there.
Use MergeData as the source for the data frame and add one more call to geom_text
p <- ggplot(MergeData, aes(x = Year, y = Frequency, group = Category)) +
geom_bar(aes(fill = Category), stat="identity") +
geom_text(aes(label=Frequency,y = pos), size = 3, vjust = 1) +
geom_text(
aes(y = max_pos, label = max_pos), size = 4,
vjust = -.5) +
geom_text(aes(x = Year, y = pos, label = per_cent), vjust = -1, size = 4)
print(p);
You may need to fiddle with hjust and vjust to get the text just how you like it.
Thank you for your response. I think it is very good.
p <- ggplot(MergeData, aes(x = Year, y = Frequency, group = Category)) +
geom_bar(aes(fill = Category), stat="identity") +
geom_text(aes(label=Frequency,y = pos), vjust = 1,size = 2,hjust = 0.5) +
geom_text(aes(y = max_pos, label = max_pos), size = 3,vjust = -.1) +
geom_text(aes(x = Year, y = pos, label = per_cent), vjust = -.4, size = 2)+
xlab("Year") + ylab(" Number of People") + # Set axis labels
ggtitle("Distribution by Category over Year") + # Set title
theme(panel.background =
element_rect(fill = 'white', colour = 'white'),
legend.position = "bottom" ,
legend.title = element_text(color="black",
size=7),
legend.key.width = unit(1,"inch") );
print(p);
Now the percent sits above the count, in other words "17%" then "168", but I want "168" then "17%". I tried switching the order of the geom_text() calls, but it did not work. Do you know how to fix it?
Yes, it helped. I fixed the count at the center of each stack, so I needed to adjust the percent label. The code below fixed my issue. Thank you so much for your help.
p <- ggplot(MergeData, aes(x = Year, y = Frequency, group = Category)) +
geom_bar(aes(fill = Category), stat="identity") +
geom_text(aes(label=Frequency,y = pos), vjust = 1,
size = 2,hjust = 0.5) +
geom_text(aes(y = max_pos, label = max_pos), size = 3,vjust = -.1) +
geom_text(aes(x = Year, y = pos, label = per_cent), vjust = 1.95,
size = 2,hjust=0.3)+
xlab("Year") + ylab(" Number of People") + # Set axis labels
ggtitle("Distribution by Category over Year") + # Set title;
theme(panel.background =
element_rect(fill = 'white', colour = 'white'),
legend.position = "bottom" ,
legend.title = element_text(color="black",
size=7) );
print(p);
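An alternative sketch, not from the thread, avoids tuning vjust for two separate text layers: build one label that holds the count on the first line and the percent on the second, then let position_stack() center it in each segment. The column names pct and label are illustrative.

```r
library(ggplot2)

Year <- rep(c("2006-07", "2007-08", "2008-09", "2009-10"), each = 4)
Category <- rep(c("A", "B", "C", "D"), times = 4)
Frequency <- c(168, 259, 226, 340, 216, 431, 319, 368, 423, 645, 234, 685, 166, 467, 274, 251)
Data <- data.frame(Year, Category, Frequency)

# Percent of each category within its year
Data$pct <- round(100 * Data$Frequency / ave(Data$Frequency, Data$Year, FUN = sum))
# Count on the first line, percent on the second
Data$label <- paste0(Data$Frequency, "\n", Data$pct, "%")

p <- ggplot(Data, aes(x = Year, y = Frequency, fill = Category, label = label)) +
  geom_col() +
  geom_text(size = 3, position = position_stack(vjust = 0.5))
print(p)
```

Because the labels are stacked by the same position adjustment as the bars, no manually computed pos column is needed for them.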
