R percent labels on pie chart [duplicate] - r

This question already has answers here:
pie chart with ggplot2 with specific order and percentage annotations
(2 answers)
Closed 5 years ago.
I'm trying to add percent labels to a pie chart, but none of the solutions I've found works. The chart displays the number of tasks completed, grouped by category.
output$plot2 <- renderPlot({
ggplot(data = data[data$status == '100% completed', ], aes(x = factor(1), fill = category)) +
geom_bar(width = 1) +
coord_polar("y")
})

Using geom_text with position_stack to adjust the label locations would work.
library(ggplot2)
library(dplyr)
# Create a data frame which is able to replicate your plot
plot_frame <- data.frame(category = c("A", "B", "B", "C"))
# Get counts of categories
plot_frame <- plot_frame %>%
group_by(category) %>%
summarise(counts = n()) %>%
mutate(percentages = counts/sum(counts)*100)
# Plot
ggplot(plot_frame, aes(x = factor(1), y = counts)) +
geom_col(aes(fill = category), width = 1) +
# group = category keeps the label stacking order in sync with the fill order
geom_text(aes(label = percentages, group = category), position = position_stack(vjust = 0.5)) +
coord_polar("y")
The code above generates a pie chart with each slice labelled by its percentage.
You might want to change the y-axis from counts to percentages since you are labeling the latter. In that case, change the values passed to ggplot accordingly.
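A minimal sketch of that suggestion, assuming the same toy data as above: percentages are mapped to y instead of counts, and the labels get a percent sign. The label column and the rounding to one decimal place are my own additions, not part of the original answer.

```r
library(ggplot2)
library(dplyr)

# Same toy data as in the answer above
plot_frame <- data.frame(category = c("A", "B", "B", "C")) %>%
  count(category, name = "counts") %>%
  mutate(percentages = counts / sum(counts) * 100,
         label = paste0(round(percentages, 1), "%"))  # hypothetical helper column

p <- ggplot(plot_frame, aes(x = factor(1), y = percentages)) +
  geom_col(aes(fill = category), width = 1) +
  # group = category makes the text stack in the same order as the slices
  geom_text(aes(label = label, group = category),
            position = position_stack(vjust = 0.5)) +
  coord_polar("y")
p
```

Because y is now a percentage, the slices always sum to 100 regardless of how many tasks are in the data.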

Related

GGplot: Two stacked bar plots side by side (not facets)

I am trying to recreate this solution using ggplot2 in R: Combining two stacked bar plots for a grouped stacked bar plot
diamonds %>%
filter(color=="D"|color=="E"|color=="F") %>%
mutate(dummy=rep(c("a","b"),each=13057)) %>%
ggplot(aes(x=color,y=price))+
geom_bar(aes(fill=clarity),stat="identity",width=.25)+
facet_wrap(~cut)
I added a new variable to the diamonds dataset called dummy. dummy has two values: a and b. Let's say I want to compare these two values by creating a bar graph that has two stacked bars right next to each other (one for each value of dummy) for each value of color. How can I manipulate this such that there are two stacked bars for each value of color?
I think it would involve position dodge and/or a separate legend, but I've been unsuccessful so far. I do not want to add another facet - I want these both on the x-axis within each facet.
Similar to the approach in the post you linked, one option to achieve your desired result is to use two geom_col layers and convert the x-axis variable to numeric, like so. However, doing so requires setting the breaks and labels manually via scale_x_continuous. Additionally, I used the ggnewscale package to add a second fill scale:
library(ggplot2)
library(dplyr)
d <- diamonds %>%
filter(color == "D" | color == "E" | color == "F") %>%
mutate(dummy = rep(c("a", "b"), each = 13057))
ggplot(mapping = aes(y = price)) +
geom_col(data = filter(d, dummy == "a"), aes(x = as.numeric(color) - .15, fill = clarity), width = .3) +
scale_fill_viridis_d(name = "a", guide = guide_legend(order = 1)) +
scale_x_continuous(breaks = seq_along(levels(d$color)), labels = levels(d$color)) +
ggnewscale::new_scale_fill() +
geom_col(data = filter(d, dummy == "b"), aes(x = as.numeric(color) + .15, fill = clarity), width = .3) +
scale_fill_viridis_d(name = "b", option = "B", guide = guide_legend(order = 2)) +
facet_wrap(~cut)
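To see why the breaks and labels have to be set manually, it may help to look at what as.numeric() does to a factor. The sketch below uses a small stand-in factor (an assumption for illustration, not diamonds$color itself); the 0.15 offset matches the code above.

```r
# Stand-in for a colour factor with three levels (hypothetical example data)
color <- factor(c("D", "E", "F", "E"), levels = c("D", "E", "F"))

pos   <- as.numeric(color)  # integer code of each value's level
left  <- pos - 0.15         # x positions for the dummy == "a" bars
right <- pos + 0.15         # x positions for the dummy == "b" bars

breaks <- seq_along(levels(color))  # one tick per level
labels <- levels(color)             # level names shown in place of the numbers

data.frame(color, pos, left, right)
```

Once x is numeric, ggplot2 no longer knows the level names, which is why they are passed back in through the breaks and labels arguments.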

Adding labels to individual % inside geom_bar() using R / ggplot2 [duplicate]

This question already has answers here:
Add percentage labels to a stacked barplot
(2 answers)
Closed 3 years ago.
bgraph <- ggplot(data = data, aes(x = location)) +
geom_bar(aes(fill = success))
success is a percentage calculated as a factor of 4 categories with the varying 4 outcomes of the data set. I could separately calculate them easily, but as the ggplot is currently constituted, they are generated by the geom_bar(aes(fill=success)).
data <- as.data.frame(c(1,1,1,1,1,1,2,2,3,3,3,3,4,4,4,4,4,4,
4,4,5,5,5,5,6,6,6,6,6,6,7,7,7,7,7))
data[["success"]] <- c("a","b","c","c","d","d","a","b","b","b","c","d",
"a","b","b","b","c","c","c","d","a","b","c","d",
"a","b","c","c","d","d","a","b","b","c","d")
names(data) <- c("location","success")
bgraph <- ggplot(data = data, aes(x = location)) +
geom_bar(aes(fill = success))
bgraph
How do I get labels over the individual percentages? More specifically, I wanted 4 individual percentages for each bar. One for yellow, light orange, orange, and red, respectively. %'s all add up to 1.
Maybe there is a way to do this in ggplot directly but with some pre-processing in dplyr, you'll be able to achieve your desired output.
library(dplyr)
library(ggplot2)
data %>%
count(location, success) %>%
group_by(location) %>%
mutate(n = n/sum(n) * 100) %>%
ggplot() + aes(x = location, n, fill = success,label = paste0(round(n, 2), "%")) +
geom_bar(stat = "identity") +
geom_text(position=position_stack(vjust=0.5))
How about creating a summary frame with the relative frequencies within location and then using that with geom_col() and geom_text()?
library(forcats) # provides fct_rev()
# Create summary stats
tots <-
data %>%
group_by(location,success) %>%
summarise(
n = n()
) %>%
mutate(
rel = round(100*n/sum(n)),
)
# Plot
ggplot(data = tots, aes(x = location, y = n)) +
geom_col(aes(fill = fct_rev(success))) + # could only get it with this reversed
geom_text(aes(label = rel), position = position_stack(vjust = 0.5))
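The fct_rev() workaround is probably needed because the text layer has no grouping variable, so its labels stack in row order while the bars stack by fill level. A possible alternative, sketched here under that assumption, is to give geom_text an explicit group aesthetic so that both layers stack the same way. The data is the example from the question.

```r
library(dplyr)
library(ggplot2)

# Example data from the question
data <- data.frame(
  location = c(1,1,1,1,1,1,2,2,3,3,3,3,4,4,4,4,4,4,
               4,4,5,5,5,5,6,6,6,6,6,6,7,7,7,7,7),
  success = c("a","b","c","c","d","d","a","b","b","b","c","d",
              "a","b","b","b","c","c","c","d","a","b","c","d",
              "a","b","c","c","d","d","a","b","b","c","d")
)

# Relative frequency of each outcome within a location
tots <- data %>%
  count(location, success) %>%
  group_by(location) %>%
  mutate(rel = round(100 * n / sum(n))) %>%
  ungroup()

p <- ggplot(tots, aes(x = location, y = n)) +
  geom_col(aes(fill = success)) +
  # group = success stacks the labels in the same order as the bar segments
  geom_text(aes(label = rel, group = success),
            position = position_stack(vjust = 0.5))
p
```

With the group set, the fill order can stay as it is and the legend keeps its default ordering.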

How to standardise colours of bars in stacked bar charts? [duplicate]

This question already has an answer here:
Manually setting group colors for ggplot2
(1 answer)
Closed 3 years ago.
So I am plotting a number of stacked bar charts on antibiotic use. The antibiotics included in each chart will differ between each individual dataset being plotted. So for example, in chart 1, 'antibiotic x' may be plotted as red, but in chart 2, it may be plotted as orange.
I was wondering if there is any way in which to standardise the colouring so that antibiotic x is the same colour across all charts I am plotting? This way it would make it easier to visualise if all antibiotic classes were the same across all charts.
Some example code for what I have used to plot one of the stacked bar charts is as follows:
datastack016 %>%
mutate(Date = dmy(Date),
Q = quarter(Date, with_year = TRUE)) %>%
group_by(Q) %>%
summarise_if(is.numeric, sum, na.rm = TRUE) %>%
gather(Key, Total, -Q) %>%
ggplot(aes(Q, Total, fill = Key)) +
geom_bar(stat = "identity") +
scale_x_yearqtr(format = "%Y") +
ylab("Antibiotic Total (Grams)") +
xlab("Date (Quarters/Year)")
Any help would be much appreciated! :)
You could define the colours as a named vector and pass that to scale_fill_manual. See code below (just substitute scale_colour_manual with scale_fill_manual):
cols <- setNames(c("dodgerblue", "limegreen", "tomato"),
levels(iris$Species))
ggplot(iris, aes(Sepal.Width, Sepal.Length, colour = Species)) +
geom_point() +
scale_colour_manual(values = cols)
And now without one of the factors:
ggplot(iris[iris$Species != "setosa", ],
aes(Sepal.Width, Sepal.Length, colour = Species)) +
geom_point() +
scale_colour_manual(values = cols)
Note that cols still contains dodgerblue, but this is dropped from the legend since we don't have the first group (setosa).
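Applied to stacked bars, the same idea might look like the sketch below. The antibiotic names, colours and totals are made up for illustration, and plot_stack is a hypothetical helper, not something from the question.

```r
library(ggplot2)

# Hypothetical antibiotic names and colours, for illustration only
abx_cols <- c(amoxicillin = "dodgerblue",
              doxycycline = "limegreen",
              cefalexin   = "tomato")

# Two made-up datasets containing different subsets of antibiotics
chart1 <- data.frame(Q = c("2016 Q1", "2016 Q1", "2016 Q2"),
                     Key = c("amoxicillin", "doxycycline", "amoxicillin"),
                     Total = c(10, 5, 8))
chart2 <- data.frame(Q = c("2016 Q1", "2016 Q2"),
                     Key = c("cefalexin", "amoxicillin"),
                     Total = c(3, 7))

# Hypothetical helper: every chart shares the same named colour vector
plot_stack <- function(df) {
  ggplot(df, aes(Q, Total, fill = Key)) +
    geom_col() +
    scale_fill_manual(values = abx_cols)
}

p1 <- plot_stack(chart1)
p2 <- plot_stack(chart2)
```

Since the colours are matched by name, amoxicillin is drawn in the same colour in both charts even though each dataset contains a different set of antibiotics.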

How to manually change grouping of just stacked portion of a stacked + grouped ggplot bar-chart [duplicate]

This question already has answers here:
Order categorical data in a stacked bar plot with ggplot2
(3 answers)
Closed 5 years ago.
I'm following the example given in this post for creating a grouped and stacked bar chart: How to produce stacked bars within grouped barchart in R
From that post:
library(reshape2)
library(dplyr)
library(ggplot2)
test <- data.frame(person=c("A", "B", "C", "D", "E"),
value1=c(100,150,120,80,150),
value2=c(25,30,45,30,30) ,
value3=c(100,120,150,150,200))
melted <- melt(test, "person")
melted$cat <- ''
melted[melted$variable == 'value1',]$cat <- "first"
melted[melted$variable != 'value1',]$cat <- "second"
ggplot(melted, aes(x = cat, y = value, fill = variable)) +
geom_bar(stat = 'identity', position = 'stack') +
facet_grid(~ person)
As it is above, the plot orders value2 on top of value3.
What I am trying to do is change the order of the stacked portion, i.e. I'd like to place value3 on top of value2.
I've tried manually changing the order of the variable:
melted2 <- melted %>%
arrange(desc(variable))
ggplot(melted2, aes(x = cat, y = value, fill = variable)) +
geom_bar(stat = 'identity', position = 'stack') +
facet_grid(~ person)
But the plot output looks identical to the first one. Essentially, the reordering of the input data does not accomplish the task.
Thank you in advance!
This should work though it isn't clear to me exactly what order you'd like these in. But you can use levels to accomplish this:
melted$variable <- factor(melted$variable, levels = c("value1","value3","value2"))
ggplot(melted, aes(x = cat, y = value, fill = variable)) +
geom_bar(stat = 'identity', position = 'stack') +
facet_grid(~ person)
The approach you used before only arranges the rows in order of their values:
melted2 <- melted %>%
arrange(desc(variable))
That works better for continuous vectors. You actually need to change the levels of the factor. You can check the levels using this:
levels(melted$variable)
[1] "value1" "value3" "value2"
melted$variable was already a factor but you just needed to override the default levels to what you wanted.
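The difference between reordering rows and reordering levels can be checked on a small example. This is a sketch with a made-up three-row data frame rather than the melted data from the question.

```r
library(dplyr)

# Made-up data frame with a factor column, for illustration only
df <- data.frame(variable = factor(c("value1", "value2", "value3")),
                 value = c(100, 25, 100))

# arrange() reorders the rows but leaves the factor levels untouched
reordered <- df %>% arrange(desc(variable))
levels(reordered$variable)

# Re-creating the factor with explicit levels changes the level order,
# which is what ggplot2 uses to decide the stacking order
df$variable <- factor(df$variable, levels = c("value1", "value3", "value2"))
levels(df$variable)
```

The first call to levels() still reports the default alphabetical order, while the second reports the order given in the levels argument.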

How to center stacked percent barchart labels

I am trying to plot a nice stacked percent bar chart using ggplot2. I've read some material and almost managed to plot what I want. I'm also listing that material here, since it may be useful to have in one place:
How do I label a stacked bar chart in ggplot2 without creating a summary data frame?
Create stacked barplot where each stack is scaled to sum to 100%
R stacked percentage bar plot with percentage of binary factor and labels (with ggplot)
My problem is that I can't place labels where I want - in the middle of the bars.
In my current plot the labels look awful and also overlap each other.
What I am looking for right now is:
1. How to place labels in the middle of the bars (areas)?
2. How to plot only some of the labels, for example those greater than 10%?
3. How to solve the overlapping problem?
For Q 1, @MikeWise suggested a possible solution. However, I still can't solve this problem.
I also enclose a reproducible example showing how I plotted this graph.
library('plyr')
library('ggplot2')
library('scales')
set.seed(1992)
n=68
Category <- sample(c("Black", "Red", "Blue", "Cyna", "Purple"), n, replace = TRUE, prob = NULL)
Brand <- sample("Brand", n, replace = TRUE, prob = NULL)
Brand <- paste0(Brand, sample(1:5, n, replace = TRUE, prob = NULL))
USD <- abs(rnorm(n))*100
df <- data.frame(Category, Brand, USD)
# Calculate the percentages
df = ddply(df, .(Brand), transform, percent = USD/sum(USD) * 100)
# Format the labels and calculate their positions
df = ddply(df, .(Brand), transform, pos = (cumsum(USD) - 0.5 * USD))
# Create nice labels
df$label = paste0(sprintf("%.0f", df$percent), "%")
ggplot(df, aes(x=reorder(Brand,USD,
function(x)+sum(x)), y=percent, fill=Category))+
geom_bar(position = "fill", stat='identity', width = .7)+
geom_text(aes(label=label, ymax=100, ymin=0), vjust=0, hjust=0,color = "white", position=position_fill())+
coord_flip()+
scale_y_continuous(labels = percent_format())+
ylab("")+
xlab("")
Here's how to center the labels and avoid plotting labels for small percentages. An additional issue in your data is that you have multiple bar sections for each colour. Instead, it seems to me all the bar sections of a given colour should be combined. The code below uses dplyr instead of plyr to set up the data for plotting:
library(dplyr)
# Initial data frame
df <- data.frame(Category, Brand, USD)
# Calculate percentages
df.summary = df %>% group_by(Brand, Category) %>%
summarise(USD = sum(USD)) %>% # Within each Brand, sum all values in each Category
mutate(percent = USD/sum(USD))
With ggplot2 version 2, it is no longer necessary to calculate the coordinates of the text labels to get them centered. Instead, you can use position=position_stack(vjust=0.5). For example:
ggplot(df.summary, aes(x=reorder(Brand, USD, sum), y=percent, fill=Category)) +
geom_bar(stat="identity", width = .7, colour="black", lwd=0.1) +
geom_text(aes(label=ifelse(percent >= 0.07, paste0(sprintf("%.0f", percent*100),"%"),"")),
position=position_stack(vjust=0.5), colour="white") +
coord_flip() +
scale_y_continuous(labels = percent_format()) +
labs(y="", x="")
With older versions, we need to calculate the position. (Same as above, but with an extra line defining pos):
# Calculate percentages and label positions
df.summary = df %>% group_by(Brand, Category) %>%
summarise(USD = sum(USD)) %>% # Within each Brand, sum all values in each Category
mutate(percent = USD/sum(USD),
pos = cumsum(percent) - 0.5*percent)
Then plot the data using an ifelse statement to determine whether a label is plotted or not. In this case, I've avoided plotting a label for percentages less than 7%.
ggplot(df.summary, aes(x=reorder(Brand,USD,function(x)+sum(x)), y=percent, fill=Category)) +
geom_bar(stat='identity', width = .7, colour="black", lwd=0.1) +
geom_text(aes(label=ifelse(percent >= 0.07, paste0(sprintf("%.0f", percent*100),"%"),""),
y=pos), colour="white") +
coord_flip() +
scale_y_continuous(labels = percent_format()) +
labs(y="", x="")
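The position formula and the label filter can be checked by hand on a single bar. The shares below are hypothetical values for one Brand, chosen so that one segment falls under the 7% threshold.

```r
# Hypothetical shares of three categories within one Brand (they sum to 1)
percent <- c(0.05, 0.45, 0.50)

# Upper edge of each segment minus half its height gives the segment midpoint
pos <- cumsum(percent) - 0.5 * percent

# Segments below 7% get an empty label, as in the plotting code above
label <- ifelse(percent >= 0.07,
                paste0(sprintf("%.0f", percent * 100), "%"),
                "")

data.frame(percent, pos, label)
```

The midpoints come out as 0.025, 0.275 and 0.75, and the first segment gets an empty label because 5% is below the threshold.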
I followed the example and found a way to put nice labels on a simple stacked bar chart. I think it might be useful too.
df <- data.frame(Category, Brand, USD)
# Calculate percentages and label positions
df.summary = df %>% group_by(Brand, Category) %>%
summarise(USD = sum(USD)) %>% # Within each Brand, sum all values in each Category
mutate( pos = cumsum(USD)-0.5*USD)
ggplot(df.summary, aes(x=reorder(Brand,USD,function(x)+sum(x)), y=USD, fill=Category)) +
geom_bar(stat='identity', width = .7, colour="black", lwd=0.1) +
geom_text(aes(label=ifelse(USD>100,round(USD,0),""),
y=pos), colour="white") +
coord_flip()+
labs(y="", x="")
