Changing Colors Grouped Bar Chart ggplot2 - r

I am trying to adjust the colors of my bar chart to be "#054C70" and "#05C3DE" respectively. When I use the following code:
p2 <- ggplot(b, aes(x = Currency, y = amount, color = inbound_outbound)) + geom_bar(position = "dodge", stat = "identity") + labs(title = "Subset Avg. Order Amount") + theme(axis.text.x = element_text(angle = 90, hjust = 1))
+ scale_fill_manual(values = c("#054C70","#05C3DE"))
I get the following error:
Error in +scale_fill_manual(values = c("#054C70", "#05C3DE")) :
invalid argument to unary operator
I am coding in R. Any help would be appreciated. Thank you.

There are a few things going on here.
The + sign at the beginning of your second line of code should be at the end of your first line of code. R treats the first line as a complete statement, so the second line is parsed on its own as a unary plus applied to scale_fill_manual(...).
If you want to change the colors of the bars themselves (and not just the outline of the bars), you'll want to use the fill mapping (rather than the color mapping).
Using an example from the diamonds dataset, since I don't have the specific dataset you're using:
library(dplyr)
library(ggplot2)
## filter dataset to only have two different colors to best match the example
df <- diamonds %>% filter(color %in% c("D","E"))
## change color to fill in this first line
p2 <- ggplot(df, aes(x = cut, y = price, fill=color)) +
geom_bar(position = "dodge", stat = "identity") +
labs(title = "Subset Avg. Order Amount") +
## make sure the plus sign is at the end of this line
theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
scale_fill_manual(values = c("#054C70","#05C3DE"))
This would produce the following plot: example plot
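The first point can be checked without ggplot2. The following is a minimal base R sketch (the object names broken, joined and msg are illustrative only) of how the parser treats a leading versus a trailing plus sign:

```r
# A line break after a complete expression ends the statement, so "+ 2"
# on the next line is parsed as its own expression (a unary plus).
broken <- parse(text = "x <- 1\n+ 2")
length(broken)  # number of separate expressions found

# With the "+" at the end of the line the expression is still incomplete,
# so the parser keeps reading and joins both lines.
joined <- parse(text = "x <- 1 +\n2")
length(joined)

# Unary plus on a non-numeric object raises an error; capture its message.
msg <- tryCatch(+"a", error = function(e) conditionMessage(e))
msg
```

The same thing happens in the question: the ggplot(...) line is complete on its own, so the line starting with + is evaluated separately.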

Related

How to display number of cases per group in a stacked bar plot?

I am attempting to produce a stacked bar plot that has the fill color defined by a variable and also shows the number of cases represented by each of the filled sections.
Reproducible example:
library(tidyverse)
data(mpg)
ggplot(mpg,aes(manufacturer))+
geom_bar(position = "fill",stat = "count",aes(fill=drv))+
theme_classic()+
theme(text = element_text(size=20),
axis.text.x = element_text(angle = 45,
vjust = 0.5))
which produces the plot (image not shown).
Here is a pared-down version of what I would like to produce programmatically (image not shown), where the n=... labels are centered on each group's filled section and display the number of cases per group (drv) in each category (manufacturer).
Additionally, I have tried (unsuccessfully) incorporating code from this post and this post, which seem close to what I want, but when I incorporate the code from this post the following error is thrown:
Error: StatBin requires a continuous x variable: the x variable is discrete.Perhaps you want stat="count"?
I am not sure why this error is thrown because I do define stat="count" in the geom_bar() function call.
Use position_fill(vjust = 0.5) and label with after_stat(count):
ggplot(mpg, aes(manufacturer, fill = drv)) +
geom_bar(position = "fill", stat = "count")+
geom_text(aes(label = paste0("n=", after_stat(count))), stat='count', position = position_fill(vjust = 0.5)) +
theme_classic()
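An alternative, if you would rather not rely on computed statistics inside geom_text, is to count the cases first and plot the counts directly. This is only a sketch: the counts data frame and its n column come from dplyr::count() and are not part of the original answer.

```r
library(ggplot2)
library(dplyr)

# One row per manufacturer/drv combination, with the number of cases in n
counts <- count(mpg, manufacturer, drv)

p <- ggplot(counts, aes(manufacturer, n, fill = drv)) +
  # geom_col() uses the supplied y values; position = "fill" rescales each bar to 1
  geom_col(position = "fill") +
  # position_fill(vjust = 0.5) places each label in the middle of its segment
  geom_text(aes(label = paste0("n=", n)),
            position = position_fill(vjust = 0.5)) +
  theme_classic()
p
```

Because both layers use the same data and the same position adjustment, the labels stay aligned with the filled segments.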

How to change color of data points on a boxplot based on a factored variable using ggplot R

I am trying to make a series of graphs that are based on a binomial variable. I want to add data points to the graph, colored by a different factor variable with 3 levels. I have been using geom_jitter, which puts the points on the box plot, but I haven't been able to change their colors to represent the levels of the factor.
Here is the code I have been using
longg <- ggplot(long, aes(x = mbbase, y= beta)) +
geom_boxplot() + facet_wrap(~test) +
ylab("Beta") +
theme_cleveland() +
scale_fill_viridis(discrete = TRUE, alpha=0.09) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
theme(axis.title.x = element_blank()) +
geom_jitter( size=0.7, alpha=1, width = 0.05)
Here is an example of the graph I want, made with the mtcars data. Instead of a numeric variable for the color, I'd like a factor variable with 3 levels, and I only want the color of the data points to change, without adding a new box plot for each level of the factor.
With mtcars, you can try this:
library(ggplot2)
library(dplyr)
library(viridis)
mtcars %>%
# optional: divide the column to color in three. There are more elegant ways
#to do it, but in this way probably it's easier to use it in your data
mutate(new_carb = as.factor(ifelse(carb %in% c(1,2),1,
ifelse(carb %in% c(3,4),2,3)))) %>%
ggplot( aes(x = as.factor(am), y= mpg)) +
# draw the boxplot once, hiding its outliers because the jitter layer already shows every point
geom_boxplot(outlier.shape=NA) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
axis.title.x = element_blank()) +
# add the color here
geom_jitter(aes(color = new_carb),
size=0.7, alpha=1, width = 0.05) +
scale_color_viridis(discrete = TRUE, alpha=0.09)
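One of the "more elegant ways" mentioned in the comment is cut(). The sketch below uses base R only and assumes the same three bins as the nested ifelse() above; new_carb and old_carb are just illustrative names.

```r
# Same three-way split of carb as the nested ifelse(), written with cut().
# The intervals are (0,2], (2,4] and (4,Inf], labelled "1", "2" and "3".
new_carb <- cut(mtcars$carb, breaks = c(0, 2, 4, Inf),
                labels = c("1", "2", "3"))

# Cross-check against the ifelse() version from the answer
old_carb <- as.factor(ifelse(mtcars$carb %in% c(1, 2), 1,
                      ifelse(mtcars$carb %in% c(3, 4), 2, 3)))
identical(as.character(new_carb), as.character(old_carb))

table(new_carb)
```

For your own data you would only need to change the breaks and labels.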

ggplot Donut chart is not as desired

I am trying to create a donut chart using ggplot2 with the following data (example).
library(ggplot2)
library(svglite)
library(scales)
# dataframe
Sex = c('Male', 'Female')
Number = c(125, 375)
df = data.frame(Sex, Number)
df
The code I used to generate donut chart is
ggplot(aes(x= Sex, y = Number, fill = Sex), data = df) +
geom_bar(stat = "identity") +
coord_polar("y") +
theme_void() +
theme (legend.position="top") + # legend position
geom_text(aes(label = percent(Number/sum(Number))), position = position_stack(vjust = 0.75), size = 3) +
ggtitle("Participants by Sex")
The above code generated the following chart, which I am not convinced by.
For our purposes, the following chart would better communicate the message. How do I create a chart like this? What am I doing wrong in my code? I have googled without any success.
Thanks in advance for help.
They aren't in the same 'circle' because they have different x values. Imagine it as a normal plot first (i.e. without coord_polar("y")) and this will become clear. What you really want is them set at the same x value and then stacked. Here I set x to 2 because it then makes a nicely sized "donut".
donut <- ggplot(df, aes(x = 2, y = Number, fill = Sex)) +
geom_col(position = "stack", width = 1) +
geom_text(aes(label = percent(Number/sum(Number))), position = position_stack(vjust = 0.75), size = 3) +
xlim(0.5, 2.5) +
ggtitle("Participants by Sex")
donut
donut +
coord_polar("y") +
theme_void() +
theme(legend.position="top")
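A variation on the code above, as a sketch: the percentage labels are computed once in the data frame (share and label are column names introduced here, not part of the original code), and vjust = 0.5 is used so that each label sits in the middle of its slice. With width = 1 the bar covers x from 1.5 to 2.5, so xlim(0.5, 2.5) leaves the range from 0.5 to 1.5 empty, and that gap becomes the hole once coord_polar is applied.

```r
library(ggplot2)

df <- data.frame(Sex = c("Male", "Female"), Number = c(125, 375))

# share and label are illustrative helper columns
df$share <- df$Number / sum(df$Number)
df$label <- sprintf("%.0f%%", 100 * df$share)

donut <- ggplot(df, aes(x = 2, y = Number, fill = Sex)) +
  geom_col(width = 1) +                       # stacked by default
  geom_text(aes(label = label),
            position = position_stack(vjust = 0.5), size = 3) +
  xlim(0.5, 2.5) +                            # empty inner range forms the hole
  coord_polar("y") +
  theme_void() +
  theme(legend.position = "top") +
  ggtitle("Participants by Sex")
donut
```

Widening the lower limit of xlim (for example to 0) should make the hole larger, and raising it towards 1.5 should shrink it.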

R ggplot2 - displaying values inside a histogram bar [duplicate]

Using ggplot2 1.0.0, I followed the instructions in below post to figure out how to plot percentage bar plots across factors:
Sum percentages for each facet - respect "fill"
test <- data.frame(
test1 = sample(letters[1:2], 100, replace = TRUE),
test2 = sample(letters[3:8], 100, replace = TRUE)
)
library(ggplot2)
library(scales)
ggplot(test, aes(x= test2, group = test1)) +
geom_bar(aes(y = ..density.., fill = factor(..x..))) +
facet_grid(~test1) +
scale_y_continuous(labels=percent)
However, I cannot seem to get a label for either the total count or the percentage above each of the bar plots when using geom_text.
What is the correct addition to the above code that also preserves the percentage y-axis?
Staying within ggplot, you might try
ggplot(test, aes(x= test2, group=test1)) +
geom_bar(aes(y = ..density.., fill = factor(..x..))) +
geom_text(aes( label = format(100*..density.., digits=2, drop0trailing=TRUE),
y= ..density.. ), stat= "bin", vjust = -.5) +
facet_grid(~test1) +
scale_y_continuous(labels=percent)
For counts, change ..density.. to ..count.. in geom_bar and geom_text
UPDATE for ggplot 2.x
ggplot2 2.0.0 made many changes, including one that broke the original version of this code: it changed the default stat used by geom_bar. Instead of calling stat_bin, as before, to bin the data, geom_bar now calls stat_count to count observations at each location. stat_count returns prop, the proportion of the counts at that location, rather than density.
The code below has been modified to work with this new release of ggplot2. I've included two versions, both of which show the height of the bars as a percentage of counts. The first displays the proportion of the count above the bar as a percent while the second shows the count above the bar. I've also added labels for the y axis and legend.
library(ggplot2)
library(scales)
#
# Displays bar heights as percents with percentages above bars
#
ggplot(test, aes(x= test2, group=test1)) +
geom_bar(aes(y = ..prop.., fill = factor(..x..)), stat="count") +
geom_text(aes( label = scales::percent(..prop..),
y= ..prop.. ), stat= "count", vjust = -.5) +
labs(y = "Percent", fill="test2") +
facet_grid(~test1) +
scale_y_continuous(labels=percent)
#
# Displays bar heights as percents with counts above bars
#
ggplot(test, aes(x= test2, group=test1)) +
geom_bar(aes(y = ..prop.., fill = factor(..x..)), stat="count") +
geom_text(aes(label = ..count.., y= ..prop..), stat= "count", vjust = -.5) +
labs(y = "Percent", fill="test2") +
facet_grid(~test1) +
scale_y_continuous(labels=percent)
The plot from the first version is shown below.
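In more recent ggplot2 releases (3.3.0 and later) computed variables are referenced with after_stat() rather than the ..name.. notation, which has since been deprecated. The first version could then be written roughly as follows; this is a sketch of the same plot, not a change in technique, and set.seed(25) is added here only to make the sample reproducible.

```r
library(ggplot2)
library(scales)

set.seed(25)
test <- data.frame(
  test1 = sample(letters[1:2], 100, replace = TRUE),
  test2 = sample(letters[3:8], 100, replace = TRUE)
)

# Bar heights as percents with percentages above bars, using after_stat()
p <- ggplot(test, aes(x = test2, group = test1)) +
  geom_bar(aes(y = after_stat(prop), fill = factor(after_stat(x))),
           stat = "count") +
  geom_text(aes(label = percent(after_stat(prop)), y = after_stat(prop)),
            stat = "count", vjust = -0.5) +
  labs(y = "Percent", fill = "test2") +
  facet_grid(~test1) +
  scale_y_continuous(labels = percent)
p
```

Because group = test1 is set, prop is computed within each level of test1, so the bars in each facet add up to 100%.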
This is easier to do if you pre-summarize your data. For example:
library(ggplot2)
library(scales)
library(dplyr)
set.seed(25)
test <- data.frame(
test1 = sample(letters[1:2], 100, replace = TRUE),
test2 = sample(letters[3:8], 100, replace = TRUE)
)
# Summarize to get counts and percentages
test.pct = test %>% group_by(test1, test2) %>%
summarise(count=n()) %>%
mutate(pct=count/sum(count))
ggplot(test.pct, aes(x=test2, y=pct, colour=test2, fill=test2)) +
geom_bar(stat="identity") +
facet_grid(. ~ test1) +
scale_y_continuous(labels=percent, limits=c(0,0.27)) +
geom_text(data=test.pct, aes(label=paste0(round(pct*100,1),"%"),
y=pct+0.012), size=4)
(FYI, you can put the labels inside the bar as well, for example, by changing the last line of code to this: y=pct*0.5), size=4, colour="white"))
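The summary step above relies on summarise() dropping the last grouping level (test2), so that the following mutate() computes percentages within each test1 group. A more explicit sketch of the same calculation, assuming a dplyr version in which count() accepts the name argument, is:

```r
library(dplyr)

set.seed(25)
test <- data.frame(
  test1 = sample(letters[1:2], 100, replace = TRUE),
  test2 = sample(letters[3:8], 100, replace = TRUE)
)

test.pct <- test %>%
  count(test1, test2, name = "count") %>%   # one row per test1/test2 pair
  group_by(test1) %>%                        # percentages within each facet
  mutate(pct = count / sum(count)) %>%
  ungroup()

# Sanity check: the percentages in each test1 group should add up to 1
test.pct %>% group_by(test1) %>% summarise(total = sum(pct))
```

Stating the grouping explicitly makes it clear which denominator each percentage uses.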
I've used all of your code and came up with this. First assign your ggplot to a variable i.e. p <- ggplot(...) + geom_bar(...) etc. Then you could do this. You don't need to summarize much since ggplot has a build function that gives you all of this already. I'll leave it to you for the formatting and such. Good luck.
# ldply() and mapvalues() are from plyr; select() and do() are from dplyr
dat <- ggplot_build(p)$data %>% ldply() %>% select(group,density) %>%
do(data.frame(xval = rep(1:6, times = 2),test1 = mapvalues(.$group, from = c(1,2), to = c("a","b")), density = .$density))
p + geom_text(data=dat, aes(x = xval, y = (density + .02), label = percent(density)), colour="black", size = 3)

Resources