R: How to set breaks on a factor x-axis in ggplot?

How can I set the breaks of the x-axis so that every second factor level is dropped? And how can I change the hover label in ggplotly from factor(Year) to Year?
data <- data.frame (Year = c("2017","2017","2017","2016","2016","2016","2015","2015","2015","2018" ,"2018" ,"2018"),
condition = c("normal","stress","Nitrogen" ,"normal","stress", "Nitrogen","normal","stress","Nitrogen","normal","stress","Nitrogen"),
value = c(22.221268, 1.598309 ,20.560815 ,17.337966,20.440174 , 9.074674, 11.739466, 1.905651, 32.270223, 14.271606 ,12.375446, 17.470793))
library(tidyverse)
data %>%
group_by(Year) %>%
mutate(value = value / sum(value)) %>%
ggplot(aes(fill=condition, y=value, x=factor(Year))) +
geom_col(position="fill", width = 1, color = "white") +
geom_text(aes(label = scales::percent(value, accuracy = 0.1)),
position = position_fill(vjust = 0.50),
color = "white") +
scale_y_continuous(labels = scales::percent) +
scale_fill_brewer(palette = "Set1")

How to show every other value on a discrete axis is a duplicate of this question. Using my answer from there, we can define the every_nth function. As for the factor(Year) tooltip label, the easiest way to avoid that is to convert the column to factor before plotting, so the aesthetic mapping is simply x = Year.
library(plotly) # provides ggplotly()
every_nth = function(n) {
  return(function(x) {x[c(TRUE, rep(FALSE, n - 1))]})
}
data %>%
  mutate(Year = factor(Year)) %>% ## convert before grouping: dplyr refuses to modify a grouping column
  group_by(Year) %>%
  mutate(value = value / sum(value)) %>%
  ggplot(aes(fill = condition, y = value, x = Year)) +
  geom_col(position = "fill", width = 1, color = "white") +
  geom_text(aes(label = scales::percent(value, accuracy = 0.1)),
            position = position_fill(vjust = 0.50),
            color = "white") +
  scale_y_continuous(labels = scales::percent) +
  scale_x_discrete(breaks = every_nth(2)) +
  scale_fill_brewer(palette = "Set1") -> p
ggplotly(p)
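To see what the breaks function does on its own, here is a minimal sketch that applies every_nth(2) to the four year levels from the question. The `years` vector is typed in by hand for illustration; in the plot, ggplot2 passes the axis levels to the function itself.

```r
every_nth <- function(n) {
  # Returns a function that keeps elements 1, n + 1, 2n + 1, ... of its input
  function(x) x[c(TRUE, rep(FALSE, n - 1))]
}

years <- c("2015", "2016", "2017", "2018")  # hand-typed stand-in for the axis levels
kept <- every_nth(2)(years)
print(kept)
```

The logical index c(TRUE, FALSE) is recycled along the input, so "2015" and "2017" are kept and the other two levels get no tick label.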

Related

geom_text percentages labels : how to do that when using facet_wrap or facet_grid?

I have this data frame, which I plotted with ggplot:
df = data.frame(x =rep(1:5,3),
z = rep(1:3,each = 5),
y = 100:114 )
df
ggplot(df)+aes(x=x,fill=x,y=y)+
geom_col(position = 'dodge')+
facet_wrap(~z)+
geom_text(aes(label = y),
position = position_dodge(1),
vjust=-1,hjust=0,color = 'white' )+
theme_dark()+
scale_fill_gradient(low = 'orange',high = 'red')
However, I want the labels in my figure to show percentages within each category of z instead of the raw values.
I appreciate the help.
In cases like this, it's best to pre-calculate the percentages and then plot that directly:
library(tidyverse)
df %>%
group_by(z) %>%
mutate(
y_pct = y / sum(y)
) %>%
ggplot(.)+aes(x=x,fill=x,y=y)+
geom_col(position = 'dodge')+
facet_wrap(~z)+
geom_text(aes(label = sprintf('%0.1f%%', y_pct * 100)),
position = position_dodge(1),
vjust=-1,hjust=0,color = 'white' )+
theme_dark()+
scale_fill_gradient(low = 'orange',high = 'red')
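As a quick sanity check on the pre-calculated column, the sketch below recomputes the same per-facet shares with base R only (ave() in place of group_by() and mutate(), which is a substitution made here to avoid dependencies) and confirms that they add up to 1 within each z.

```r
df <- data.frame(x = rep(1:5, 3),
                 z = rep(1:3, each = 5),
                 y = 100:114)

# Share of each y within its z group (base R equivalent of the dplyr step)
df$y_pct <- ave(as.numeric(df$y), df$z, FUN = function(v) v / sum(v))

# Each facet should total 100%
pct_totals <- tapply(df$y_pct, df$z, sum)
print(pct_totals)

# Same label format as in the geom_text() call
first_label <- sprintf('%0.1f%%', df$y_pct[1] * 100)
print(first_label)  # "19.6%" (100 out of 510)
```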

How to automatically change label color depending on relative values (maximum/minimum)?

To make a dynamic visualization, for example in a dashboard, I want to display the labels (percentages or totals) in black or white depending on their actual values.
As you can see from my reprex below, I manually changed the color of the label with the highest percentage to black for better visibility.
Is there a way to set the label color automatically? The label with the highest percentage should always be black, even as the data changes over time.
library(ggplot2)
library(dplyr)
set.seed(3)
reviews <- data.frame(review_star = as.character(sample.int(5,400, replace = TRUE)),
stars = 1)
df <- reviews %>%
group_by(review_star) %>%
count() %>%
ungroup() %>%
mutate(perc = `n` / sum(`n`)) %>%
arrange(perc) %>%
mutate(labels = scales::percent(perc))
ggplot(df, aes(x = "", y = perc, fill = review_star)) +
geom_col(color = "black") +
geom_label(aes(label = labels), color = c( "white", "white","white",1,"white"),
position = position_stack(vjust = 0.5),
show.legend = FALSE) +
guides(fill = guide_legend(title = "Answer")) +
scale_fill_viridis_d() +
coord_polar(theta = "y") +
theme_void()
You can set the colors using replace(rep('white', nrow(df)), which.max(df$perc), 'black').
ggplot(df, aes(x = "", y = perc, fill = review_star)) +
geom_col(color = "black") +
geom_label(aes(label = labels),
color = replace(rep('white', nrow(df)), which.max(df$perc), 'black'),
position = position_stack(vjust = 0.5),
show.legend = FALSE) +
guides(fill = guide_legend(title = "Answer")) +
scale_fill_viridis_d() +
coord_polar(theta = "y") +
theme_void()
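The color vector can be inspected in isolation. This small sketch uses a made-up `perc` vector (not the values from the reprex) to show what the replace() expression returns.

```r
perc <- c(0.15, 0.18, 0.20, 0.22, 0.25)  # hypothetical shares, for illustration only

# Start with all-white labels, then switch the largest share to black
label_cols <- replace(rep("white", length(perc)), which.max(perc), "black")
print(label_cols)
```

Because which.max() returns the position of the first maximum, exactly one label turns black even when two shares tie.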

Horizontal percent total stacked bar chart with labels on each end

I have a simple data frame which has the probabilities that an id is real and fake, respectively:
library(tidyverse)
dat <- data.frame(id = "999", real = 0.7, fake = 0.3)
I know that I can show this as a horizontal bar chart using the code below:
dat %>%
gather(key = grp, value = prob, -id) %>%
ggplot(aes(x = id, y = prob, fill = grp)) +
geom_bar(stat = "identity") +
coord_flip()
But I was wondering if there was a way to show this in the same way as shown below, with the class labels and probabilities on either end of the bar chart?
Many thanks
A straightforward, maybe somewhat cheeky workaround is to redefine your 0.
I added a few calls that are not strictly necessary, but make it look closer to your example plot.
library(tidyverse)
dat <- data.frame(id = "999", real = -0.7, fake = 0.3) # note the minus sign!
dat %>%
gather(key = grp, value = prob, -id) %>%
ggplot(aes(x = id, y = prob, fill = grp)) +
geom_col(show.legend = FALSE) +
geom_text(aes(label = stringr::str_to_title(paste0(grp, " (", as.character(100*abs(prob)), "%)"))),
hjust = c(1,0))+
coord_flip(clip = "off") +
scale_fill_brewer(palette = "Greys") +
theme_void() +
theme(aspect.ratio = .1,
plot.margin = margin(r = 3, l = 3, unit = "lines"))
Created on 2021-02-06 by the reprex package (v0.3.0)
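If you would rather not hand-edit the data, the sign flip and the labels can be derived from the original values. The following is a base R sketch of that idea; the column names `prob_signed` and `label` and the helper `capitalize()` are made up for illustration.

```r
dat <- data.frame(id = "999", real = 0.7, fake = 0.3)

# Long format, one row per class (base R stand-in for gather())
long <- data.frame(id = dat$id,
                   grp = c("real", "fake"),
                   prob = c(dat$real, dat$fake))

# Mirror the "real" class to the left of zero; keep the label positive
long$prob_signed <- ifelse(long$grp == "real", -long$prob, long$prob)

# Hypothetical helper: upper-case the first letter of each string
capitalize <- function(s) paste0(toupper(substring(s, 1, 1)), substring(s, 2))
long$label <- sprintf("%s (%.0f%%)", capitalize(long$grp), 100 * long$prob)
print(long)
```

Mapping y = prob_signed and label = label in the plot above should then give the same picture without changing dat itself.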
I'm not sure this fully answers the question, but I think it will improve the plot. Can you try it out?
dat %>%
gather(key = grp, value = prob, -id) %>%
ggplot(aes(x = id, y = prob, fill = grp)) +
geom_bar(stat = "identity", position = "fill") +
scale_y_continuous("Proportion") +
scale_x_discrete("", expand = c(0,0)) +
# scale_fill_identity() removed: "real" and "fake" are not valid color names
coord_flip()

ggplot2: show relative % in a stacked barplot per group

I'm trying to plot a basic bar chart per group.
As values are pretty big, I want to show for each bar (i.e. group) the % of each group within the bar.
I managed to show percentage of the total, but this is not what I'm expecting : in each bar, I would like that the sum of % equal 100%.
Is there an easy way to do it without changing the dataframe ?
(DF <- data.frame( year = rep(2015:2017, each = 4),
Grp = c("Grp1", "Grp2", "Grp3", "Grp4"),
Value = trunc(rnorm(12, 2000000, 100000))) )
ggplot(DF) +
geom_bar(aes(x = year, y = Value, fill = Grp),
stat = "identity",
position = position_stack()) +
geom_text(aes(x = year, y = Value, group = Grp,
label = percent(Value/sum(Value))) ,
position = position_stack(vjust = .5))
You can create a new variable holding the percentage within each year:
library(dplyr)
library(ggplot2)
library(scales)
DF <- DF %>% group_by(year) %>% mutate(ValuePer=(Value/sum(Value))) %>% ungroup()
ggplot(DF, aes(year, ValuePer, fill = Grp)) +
geom_bar(stat = "identity", position = "fill") +
geom_text(aes(label = percent(ValuePer)),
position = position_fill())+
scale_y_continuous(labels = percent_format())
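Use position = "fill" to turn the scale into proportions and scale_y_continuous(labels = percent_format()) to display it as percentages. Note that Value / sum(Value) inside aes() is computed over the whole data frame, so the text labels in this version still show each value's share of the grand total rather than of its year.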
DF <- data.frame( year = rep(2015:2017, each = 4),
Grp = c("Grp1", "Grp2", "Grp3", "Grp4"),
Value = trunc(rnorm(12, 2000000, 100000)))
library(ggplot2)
library(scales)
ggplot(DF, aes(year, Value, fill = Grp)) +
geom_bar(stat = "identity", position = "fill") +
geom_text(aes(label = percent(Value / sum(Value))),
position = position_fill()) +
scale_y_continuous(labels = percent_format())
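The difference between the overall share and the per-year share is easiest to see on a tiny example. The sketch below uses made-up values (not the rnorm() data above) and base R's ave() for the grouped version.

```r
# Hypothetical toy data: two years, two groups each
DF <- data.frame(year = rep(2015:2016, each = 2),
                 Grp = c("Grp1", "Grp2"),
                 Value = c(1, 3, 2, 2))

# What Value / sum(Value) gives when evaluated on the whole data frame
overall <- DF$Value / sum(DF$Value)

# What the question asks for: shares that total 100% within each year
by_year <- ave(DF$Value, DF$year, FUN = function(v) v / sum(v))

print(cbind(DF, overall, by_year))
```

Here `overall` sums to 1 across all rows, while `by_year` sums to 1 inside each year, which is why the grouped column has to be computed before plotting.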
OK, combining all your suggestions, I finally arrived at this.
I had to modify my DF, which I wanted to avoid, but it stays simple, so it works:
library(dplyr)
library(ggplot2)
library(scales)
DF <- DF %>% group_by(year) %>% mutate(ValuePer=(Value/sum(Value))) %>% ungroup()
ggplot(DF, aes(year, Value, fill = Grp)) +
geom_bar(stat = "identity", position = "stack") +
geom_text(aes(label = percent(ValuePer)),
position = position_stack()) +
scale_y_continuous(labels = unit_format(unit = "M", scale = 1e-6)) # named arguments: the positional order differs between scales versions
I would use a separate geom_text for each bar, filtering the data by year (bar) with dplyr. Check whether this is what you need:
(DF <- data.frame( year = rep(2015:2017, each = 4),
Grp = c("Grp1", "Grp2", "Grp3", "Grp4"),
Value = trunc(rnorm(12, 2000000, 100000))) )
library(dplyr)
ggplot(DF) +
geom_bar(aes(x = year, y = Value, fill = Grp),
stat = "identity",
position = position_stack()) +
geom_text(data = DF %>% filter(year == 2015),
aes(x = year, y = Value,
label = scales::percent(Value/sum(Value))) ,
position = position_stack(vjust = .5)) +
geom_text(data = DF %>% filter(year == 2016),
aes(x = year, y = Value,
label = scales::percent(Value/sum(Value))) ,
position = position_stack(vjust = .5)) +
geom_text(data = DF %>% filter(year == 2017),
aes(x = year, y = Value,
label = scales::percent(Value/sum(Value))) ,
position = position_stack(vjust = .5))
The group argument is not necessary here. There may be more elegant solutions, but this is the one I could think of. Tell me if this is the output you were expecting:
Creating a new column that does the right computation may be cleaner. I could not figure out how to do the computation inside aes(); the way you did it computes the overall %, whereas Value should be grouped by year.
At least this gives you the actual values on the y-axis and the per-year % inside the bars. I would advise changing the axis labels to something like this:
scale_y_continuous(breaks = seq(0,8*10^6,10^6),
labels = c(0, paste(seq(1,8,1),'M')))
Resulting in this:
You can adapt to your context.
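One way to adapt it is to generate the labels from the breaks instead of typing them out. This is a small sketch; `million_labels` is a made-up helper name, not a ggplot2 or scales function.

```r
# Hypothetical helper: format axis values in millions, leaving zero as "0"
million_labels <- function(x) ifelse(x == 0, "0", paste(x / 1e6, "M"))

breaks <- seq(0, 8e6, 1e6)
labels <- million_labels(breaks)
print(labels)
```

The result matches c(0, paste(seq(1, 8, 1), 'M')) from the snippet above, and since scale_y_continuous() also accepts a function for labels, million_labels can be passed directly as labels = million_labels.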

Combining position_dodge and position_fill in ggplot2

What I would like to do is use both the position = "fill" and the position = "dodge" arguments of geom_bar() at the same time somehow. Using some sample data
set.seed(1234)
df <- data.frame(
Id = rep(1:10, each = 12),
Month = rep(1:12, times = 10),
Value = sample(1:2, 10 * 12, replace = TRUE)
)
I'm able to create the following graph
df.plot <- ggplot(df, aes(x = as.factor(Month), fill = as.factor(Value))) +
geom_bar(position = "fill") +
scale_x_discrete(breaks = 1:12) +
scale_y_continuous(labels = percent) +
labs(x = "Month", y = "Value")
I like the scaling and labeling of this graph but I want to be able to unstack it. However when I do the following
df.plot2 <- ggplot(df, aes(x = as.factor(Month), fill = as.factor(Value))) +
geom_bar(position = "dodge", aes(y = (..count..)/sum(..count..))) +
scale_x_discrete(breaks = 1:12) +
scale_y_continuous(labels = percent) +
labs(x = "Month", y = "Value")
The bars are in the position and scaling that I want but the y-axis labels represent the percentage of each bar relative to the total count, not the count within each month.
All in all I want the visuals of the second graph with the labeling of the first graph. Is there a relatively easy way to automate this?
Expanding on my comment:
library(ggplot2)
library(dplyr)
library(tidyr)
library(scales)
df1 <- df %>%
group_by(Month) %>%
summarise(Value1 = sum(Value == 1) / n(),
Value2 = sum(Value == 2) / n()) %>%
gather(key = Group,value = Val,Value1:Value2)
df.plot2 <- ggplot(df1, aes(x = as.factor(Month),
y = Val,
fill = as.factor(Group))) +
geom_bar(position = "dodge",stat = "identity") +
scale_y_continuous(labels = percent_format()) +
scale_x_discrete(breaks = 1:12) +
labs(x = "Month", y = "Value")
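The per-month shares can also be computed without naming each value explicitly. The sketch below is a base R alternative (swapped in here for the summarise() and gather() steps) using prop.table() on a contingency table; the output column name `Val` mirrors the answer above.

```r
set.seed(1234)
df <- data.frame(
  Id = rep(1:10, each = 12),
  Month = rep(1:12, times = 10),
  Value = sample(1:2, 10 * 12, replace = TRUE)
)

# Counts per Month x Value, then row-wise proportions (each month sums to 1)
shares <- prop.table(table(Month = df$Month, Value = df$Value), margin = 1)

# Long format with columns Month, Value, Val, ready for geom_col(position = "dodge")
shares_long <- as.data.frame(shares, responseName = "Val")
head(shares_long)
```

This version keeps working if Value takes more than two distinct levels, because the table picks up whatever values occur in the data.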

Resources