I am trying to create a donut chart using ggplot2 with the following data (example).
library(ggplot2)
library(svglite)
library(scales)
# dataframe
Sex = c('Male', 'Female')
Number = c(125, 375)
df = data.frame(Sex, Number)
df
The code I used to generate donut chart is
ggplot(aes(x= Sex, y = Number, fill = Sex), data = df) +
geom_bar(stat = "identity") +
coord_polar("y") +
theme_void() +
theme (legend.position="top") + # legend position
geom_text(aes(label = percent(Number/sum(Number))), position = position_stack(vjust = 0.75), size = 3) +
ggtitle("Participants by Sex")
The above code generated the following chart. Some how not convinced with the chart.
For our purposes, the following chart would better communicate the message. How do I create a chart like this. Where am I doing wrong in my code? I have googled with out any success.
Thanks in advance for help.
They aren't in the same 'circle' because they have different x values. Imagine it as a normal plot first (i.e. without coord_polar("y")) and this will become clear. What you really want is them set at the same x value and then stacked. Here I set x to 2 because it then makes a nicely sized "donut".
donut <- ggplot(df, aes(x = 2, y = Number, fill = Sex)) +
geom_col(position = "stack", width = 1) +
geom_text(aes(label = percent(Number/sum(Number))), position = position_stack(vjust = 0.75), size = 3) +
xlim(0.5, 2.5) +
ggtitle("Participants by Sex")
donut
donut +
coord_polar("y") +
theme_void() +
theme(legend.position="top")
Related
I want to separately plot data in a bubble plot like the image right (I make this in PowerPoint just to visualize).
At the moment I can only create a plot that looks like in the left where the bubble are overlapping. How can I do this in R?
b <- ggplot(df, aes(x = Year, y = Type))
b + geom_point(aes(color = Spp, size = value), alpha = 0.6) +
scale_color_manual(values = c("#0000FF", "#DAA520", "#228B22","#E7B888")) +
scale_size(range = c(0.5, 12))
You can have the use of position_dodge() argument in your geom_point. If you apply it directly on your code, it will position points in an horizontal manner, so the idea is to switch your x and y variables and use coord_flip to get it in the right way:
library(ggplot2)
ggplot(df, aes(y = as.factor(Year), x = Type))+
geom_point(aes(color = Group, size = Value), alpha = 0.6, position = position_dodge(0.9)) +
scale_color_manual(values = c("#0000FF", "#DAA520", "#228B22","#E7B888")) +
scale_size(range = c(1, 15)) +
coord_flip()
Does it look what you are trying to achieve ?
EDIT: Adding text in the middle of each points
To add labeling into each point, you can use geom_text and set the same position_dodge2 argument than for geom_point.
NB: I use position_dodge2 instead of position_dodge and slightly change values of width because I found position_dodge2 more adapted to this case.
library(ggplot2)
ggplot(df, aes(y = as.factor(Year), x = Type))+
geom_point(aes(color = Group, size = Value), alpha = 0.6,
position = position_dodge2(width = 1)) +
scale_color_manual(values = c("#0000FF", "#DAA520", "#228B22","#E7B888")) +
scale_size(range = c(3, 15)) +
coord_flip()+
geom_text(aes(label = Value, group = Group),
position = position_dodge2(width = 1))
Reproducible example
As you did not provide a reproducible example, I made one that is maybe not fully representative of your original dataset. If my answer is not working for you, you should consider providing a reproducible example (see here: How to make a great R reproducible example)
Group <- c(LETTERS[1:3],"A",LETTERS[1:2],LETTERS[1:3])
Year <- c(rep(1918,4),rep(2018,5))
Type <- c(rep("PP",3),"QQ","PP","PP","QQ","QQ","QQ")
Value <- sample(1:50,9)
df <- data.frame(Group, Year, Value, Type)
df$Type <- factor(df$Type, levels = c("PP","QQ"))
One of the value in my dataset is zero, I think because of that I am not able to adjust labels correctly in my pie chart.
#Providing you all a sample dataset
Averages <- data.frame(Parameters = c("Cars","Motorbike","Bicycle","Airplane","Ships"), Values = c(15.00,2.81,50.84,51.86,0.00))
mycols <- c("#0073C2FF", "#EFC000FF", "#868686FF", "#CD534CFF","#FF9999")
duty_cycle_pie <- Averages %>% ggplot(aes(x = "", y = Values, fill = Parameters)) +
geom_bar(width = 1, stat = "identity", color = "white") +
coord_polar("y", start = 0)+
geom_text(aes(y = cumsum(Values) - 0.7*Values,label = round(Values*100/sum(Values),2)), color = "white")+
scale_fill_manual(values = mycols)
Labels are not placed in the correct way. Please tell me how can get 3D piechart.
Welcome to stackoverflow. I am happy to help, however, I must note that piecharts are highly debatable and 3D piecharts are considered bad practice.
https://www.darkhorseanalytics.com/blog/salvaging-the-pie
https://en.wikipedia.org/wiki/Misleading_graph#3D_Pie_chart_slice_perspective
Additionally, if the names of your variables reflect your actual dataset (Averages), a piechart would not be appropriate as the pieces do not seem to be describing parts of a whole. Ex: avg value of Bicycle is 50.84 and avg value of Airplane is 51.86. Having these result in 43% and 42% is confusing; a barchart would be easier to follow.
Nonetheless, the answer to your question about placement can be solved with position_stack().
library(tidyverse)
Averages <-
data.frame(
Parameters = c("Cars","Motorbike","Bicycle","Airplane","Ships"),
Values = c(15.00,2.81,50.84,51.86,0.00)
) %>%
mutate(
# this will ensure the slices go biggest to smallest (a best practice)
Parameters = fct_reorder(Parameters, Values),
label = round(Values/sum(Values) * 100, 2)
)
mycols <- c("#0073C2FF", "#EFC000FF", "#868686FF", "#CD534CFF","#FF9999")
Averages %>%
ggplot(aes(x = "", y = Values, fill = Parameters)) +
geom_bar(width = 1, stat = "identity", color = "white") +
coord_polar("y", start = 0) +
geom_text(
aes(y = Values, label = label),
color = "black",
position = position_stack(vjust = 0.5)
) +
scale_fill_manual(values = mycols)
To move the pieces towards the outside of the pie, you can look into ggrepel
https://stackoverflow.com/a/44438500/4650934
For my earlier point, I might try something like this instead of a piechart:
ggplot(Averages, aes(Parameters, Values)) +
geom_col(aes(y = 100), fill = "grey70") +
geom_col(fill = "navyblue") +
coord_flip()
I am trying to adjust the colors of my bar chart to be "#054C70" and "#05C3DE" respectively. When I use the following code:
p2 <- ggplot(b, aes(x = Currency, y = amount, color = inbound_outbound)) + geom_bar(position = "dodge", stat = "identity") + labs(title = "Subset Avg. Order Amount") + theme(axis.text.x = element_text(angle = 90, hjust = 1))
+ scale_fill_manual(values = c("#054C70","#05C3DE"))
I get the following error:
Error in +scale_fill_manual(values = c("#054C70", "#05C3DE")) :
invalid argument to unary operator
I am coding in R. Any help would be appreciated. Thank you.
There are a few things going on here.
The + sign at the beginning of your second line of code should be at the end of your first line of code.
If you want to change the colors of the bars themselves (and not just the outline of the bars), you'll want to use the fill mapping (rather than the color mapping.
Using an example from the diamonds dataset, since I don't have the specific dataset you're using,
library(dplyr)
library(ggplot2)
## filter dataset to only have two different colors to best match the example
df <- diamonds %>% filter(color %in% c("D","E"))
## change color to fill in this first line
p2 <- ggplot(df, aes(x = cut, y = price, fill=color)) +
geom_bar(position = "dodge", stat = "identity") +
labs(title = "Subset Avg. Order Amount") +
## make sure the plus sign is at the end of this line
theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
scale_fill_manual(values = c("#054C70","#05C3DE"))
This would produce the following plot: example plot
Using ggplot2 1.0.0, I followed the instructions in below post to figure out how to plot percentage bar plots across factors:
Sum percentages for each facet - respect "fill"
test <- data.frame(
test1 = sample(letters[1:2], 100, replace = TRUE),
test2 = sample(letters[3:8], 100, replace = TRUE)
)
library(ggplot2)
library(scales)
ggplot(test, aes(x= test2, group = test1)) +
geom_bar(aes(y = ..density.., fill = factor(..x..))) +
facet_grid(~test1) +
scale_y_continuous(labels=percent)
However, I cannot seem to get a label for either the total count or the percentage above each of the bar plots when using geom_text.
What is the correct addition to the above code that also preserves the percentage y-axis?
Staying within ggplot, you might try
ggplot(test, aes(x= test2, group=test1)) +
geom_bar(aes(y = ..density.., fill = factor(..x..))) +
geom_text(aes( label = format(100*..density.., digits=2, drop0trailing=TRUE),
y= ..density.. ), stat= "bin", vjust = -.5) +
facet_grid(~test1) +
scale_y_continuous(labels=percent)
For counts, change ..density.. to ..count.. in geom_bar and geom_text
UPDATE for ggplot 2.x
ggplot2 2.0 made many changes to ggplot including one that broke the original version of this code when it changed the default stat function used by geom_bar ggplot 2.0.0. Instead of calling stat_bin, as before, to bin the data, it now calls stat_count to count observations at each location. stat_count returns prop as the proportion of the counts at that location rather than density.
The code below has been modified to work with this new release of ggplot2. I've included two versions, both of which show the height of the bars as a percentage of counts. The first displays the proportion of the count above the bar as a percent while the second shows the count above the bar. I've also added labels for the y axis and legend.
library(ggplot2)
library(scales)
#
# Displays bar heights as percents with percentages above bars
#
ggplot(test, aes(x= test2, group=test1)) +
geom_bar(aes(y = ..prop.., fill = factor(..x..)), stat="count") +
geom_text(aes( label = scales::percent(..prop..),
y= ..prop.. ), stat= "count", vjust = -.5) +
labs(y = "Percent", fill="test2") +
facet_grid(~test1) +
scale_y_continuous(labels=percent)
#
# Displays bar heights as percents with counts above bars
#
ggplot(test, aes(x= test2, group=test1)) +
geom_bar(aes(y = ..prop.., fill = factor(..x..)), stat="count") +
geom_text(aes(label = ..count.., y= ..prop..), stat= "count", vjust = -.5) +
labs(y = "Percent", fill="test2") +
facet_grid(~test1) +
scale_y_continuous(labels=percent)
The plot from the first version is shown below.
This is easier to do if you pre-summarize your data. For example:
library(ggplot2)
library(scales)
library(dplyr)
set.seed(25)
test <- data.frame(
test1 = sample(letters[1:2], 100, replace = TRUE),
test2 = sample(letters[3:8], 100, replace = TRUE)
)
# Summarize to get counts and percentages
test.pct = test %>% group_by(test1, test2) %>%
summarise(count=n()) %>%
mutate(pct=count/sum(count))
ggplot(test.pct, aes(x=test2, y=pct, colour=test2, fill=test2)) +
geom_bar(stat="identity") +
facet_grid(. ~ test1) +
scale_y_continuous(labels=percent, limits=c(0,0.27)) +
geom_text(data=test.pct, aes(label=paste0(round(pct*100,1),"%"),
y=pct+0.012), size=4)
(FYI, you can put the labels inside the bar as well, for example, by changing the last line of code to this: y=pct*0.5), size=4, colour="white"))
I've used all of your code and came up with this. First assign your ggplot to a variable i.e. p <- ggplot(...) + geom_bar(...) etc. Then you could do this. You don't need to summarize much since ggplot has a build function that gives you all of this already. I'll leave it to you for the formatting and such. Good luck.
dat <- ggplot_build(p)$data %>% ldply() %>% select(group,density) %>%
do(data.frame(xval = rep(1:6, times = 2),test1 = mapvalues(.$group, from = c(1,2), to = c("a","b")), density = .$density))
p + geom_text(data=dat, aes(x = xval, y = (density + .02), label = percent(density)), colour="black", size = 3)
Using ggplot2 1.0.0, I followed the instructions in below post to figure out how to plot percentage bar plots across factors:
Sum percentages for each facet - respect "fill"
test <- data.frame(
test1 = sample(letters[1:2], 100, replace = TRUE),
test2 = sample(letters[3:8], 100, replace = TRUE)
)
library(ggplot2)
library(scales)
ggplot(test, aes(x= test2, group = test1)) +
geom_bar(aes(y = ..density.., fill = factor(..x..))) +
facet_grid(~test1) +
scale_y_continuous(labels=percent)
However, I cannot seem to get a label for either the total count or the percentage above each of the bar plots when using geom_text.
What is the correct addition to the above code that also preserves the percentage y-axis?
Staying within ggplot, you might try
ggplot(test, aes(x= test2, group=test1)) +
geom_bar(aes(y = ..density.., fill = factor(..x..))) +
geom_text(aes( label = format(100*..density.., digits=2, drop0trailing=TRUE),
y= ..density.. ), stat= "bin", vjust = -.5) +
facet_grid(~test1) +
scale_y_continuous(labels=percent)
For counts, change ..density.. to ..count.. in geom_bar and geom_text
UPDATE for ggplot 2.x
ggplot2 2.0 made many changes to ggplot including one that broke the original version of this code when it changed the default stat function used by geom_bar ggplot 2.0.0. Instead of calling stat_bin, as before, to bin the data, it now calls stat_count to count observations at each location. stat_count returns prop as the proportion of the counts at that location rather than density.
The code below has been modified to work with this new release of ggplot2. I've included two versions, both of which show the height of the bars as a percentage of counts. The first displays the proportion of the count above the bar as a percent while the second shows the count above the bar. I've also added labels for the y axis and legend.
library(ggplot2)
library(scales)
#
# Displays bar heights as percents with percentages above bars
#
ggplot(test, aes(x= test2, group=test1)) +
geom_bar(aes(y = ..prop.., fill = factor(..x..)), stat="count") +
geom_text(aes( label = scales::percent(..prop..),
y= ..prop.. ), stat= "count", vjust = -.5) +
labs(y = "Percent", fill="test2") +
facet_grid(~test1) +
scale_y_continuous(labels=percent)
#
# Displays bar heights as percents with counts above bars
#
ggplot(test, aes(x= test2, group=test1)) +
geom_bar(aes(y = ..prop.., fill = factor(..x..)), stat="count") +
geom_text(aes(label = ..count.., y= ..prop..), stat= "count", vjust = -.5) +
labs(y = "Percent", fill="test2") +
facet_grid(~test1) +
scale_y_continuous(labels=percent)
The plot from the first version is shown below.
This is easier to do if you pre-summarize your data. For example:
library(ggplot2)
library(scales)
library(dplyr)
set.seed(25)
test <- data.frame(
test1 = sample(letters[1:2], 100, replace = TRUE),
test2 = sample(letters[3:8], 100, replace = TRUE)
)
# Summarize to get counts and percentages
test.pct = test %>% group_by(test1, test2) %>%
summarise(count=n()) %>%
mutate(pct=count/sum(count))
ggplot(test.pct, aes(x=test2, y=pct, colour=test2, fill=test2)) +
geom_bar(stat="identity") +
facet_grid(. ~ test1) +
scale_y_continuous(labels=percent, limits=c(0,0.27)) +
geom_text(data=test.pct, aes(label=paste0(round(pct*100,1),"%"),
y=pct+0.012), size=4)
(FYI, you can put the labels inside the bar as well, for example, by changing the last line of code to this: y=pct*0.5), size=4, colour="white"))
I've used all of your code and came up with this. First assign your ggplot to a variable i.e. p <- ggplot(...) + geom_bar(...) etc. Then you could do this. You don't need to summarize much since ggplot has a build function that gives you all of this already. I'll leave it to you for the formatting and such. Good luck.
dat <- ggplot_build(p)$data %>% ldply() %>% select(group,density) %>%
do(data.frame(xval = rep(1:6, times = 2),test1 = mapvalues(.$group, from = c(1,2), to = c("a","b")), density = .$density))
p + geom_text(data=dat, aes(x = xval, y = (density + .02), label = percent(density)), colour="black", size = 3)