Horizontal bar graph with ggplot: Fixed distance of bars - r

This question is related to the one I have already asked here: Properly align country names and values for horizontal bar graph in ggplot
I would like to produce the following bar graph, but want to make sure that the distance from the beginning of the country name to the bar graph is always the same. So no matter if I read in the first or second df, it should always be the same distance as here:
#df1
loooooong country1 100% Bar
looooong country2 99% Bar
#df2
short country1 100% Bar
short country2 99% Bar
As it is now, the distance between the end of the country name and the bar is always the same. I have found a workaround, padding the country names with spaces and using a monospace font, but this looks pretty bad. :)
library(ggplot2)
library(dplyr)
### first df
df <- data.frame(
info_country = c("country1", "country loooooooong name", "country2", "country middle name", "country3"),
indicator = c(50,100,50,50,5))
### second df
# df <- data.frame(
# info_country = c("country1", "country3", "country2", "country4", "country5"),
# indicator = c(50,100,50,50,5))
### change factor level for ggplot order
df$info_country <- factor(df$info_country, levels = df$info_country[order(df$indicator)])
factor(df$info_country)
### create bar graph
bar_graph <- df %>%
ggplot( aes(x = info_country, y = indicator)) +
geom_bar(stat = "identity", width = 0.8, fill = "#EE5859") +
geom_text(aes(y = -2, label = paste(indicator, "%", sep=" ")),
hjust = 1, size = 11 * 0.8 / ggplot2::.pt, color = "grey30") +
xlab("") +
ylab("") +
scale_y_continuous(labels = NULL, limits = c(-2, 100)) +
# Use clip = "off" to prevent that percentage labels are clipped off
coord_flip(clip = "off") +
theme(
panel.background = element_rect(fill = "white", colour = NA),
# Set color of ticks to NA
axis.ticks.x = element_line(color=NA),
axis.ticks.y = element_line(color=NA),
# Increase the margin
axis.text.y = element_text(hjust=0, margin = margin(r = 6, unit = "cm")),
axis.text.x = element_text(hjust=0),
)
bar_graph

I would simply repeat the trick of plotting text as if it were axis labels. You can control the distance between the left edge of the labels and the start of the bars by setting the labels' hjust to 0 and using a large negative number for y in their aesthetic. A value of -100 is nicely symmetrical:
df %>%
ggplot( aes(x = info_country, y = indicator)) +
geom_bar(stat = "identity", width = 0.8, fill = "#EE5859") +
geom_text(aes(y = -2, label = paste(indicator, "%", sep=" ")),
hjust = 1, size = 11 * 0.8 / .pt, color = "grey30") +
geom_text(aes(y = -100, label = info_country),
hjust = 0, size = 11 * 0.8 / .pt, color = "grey30") +
labs(x = "", y = "") +
scale_y_continuous(labels = NULL, limits = c(-100, 100)) +
coord_flip(clip = "off") +
theme(panel.background = element_rect(fill = "white", colour = NA),
axis.ticks.x = element_line(color = NA),
axis.ticks.y = element_line(color = NA),
axis.text.y = element_blank())
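To check that the left edge of the names really stays put for both data frames, the answer's code can be wrapped in a small function. This is only a sketch: fixed_label_bars and its label_pos argument are hypothetical names introduced here, not part of the original answer, and geom_col() is used as shorthand for geom_bar(stat = "identity").

```r
library(ggplot2)

# Hypothetical helper (not from the original answer): draws the bar chart with
# the country names as a text layer anchored at a fixed data coordinate.
fixed_label_bars <- function(df, label_pos = -100) {
  # Order the bars by value, as in the question
  df$info_country <- factor(df$info_country,
                            levels = df$info_country[order(df$indicator)])
  ggplot(df, aes(x = info_country, y = indicator)) +
    geom_col(width = 0.8, fill = "#EE5859") +
    # Percentage labels, right-aligned just left of the bars
    geom_text(aes(y = -2, label = paste(indicator, "%")),
              hjust = 1, size = 11 * 0.8 / .pt, color = "grey30") +
    # Country names, left-aligned at label_pos regardless of their length
    geom_text(aes(y = label_pos, label = info_country),
              hjust = 0, size = 11 * 0.8 / .pt, color = "grey30") +
    labs(x = NULL, y = NULL) +
    scale_y_continuous(labels = NULL, limits = c(label_pos, 100)) +
    coord_flip(clip = "off") +
    theme(panel.background = element_rect(fill = "white", colour = NA),
          axis.ticks = element_blank(),
          axis.text.y = element_blank())
}

# The two data frames from the question
df1 <- data.frame(
  info_country = c("country1", "country loooooooong name", "country2",
                   "country middle name", "country3"),
  indicator = c(50, 100, 50, 50, 5))
df2 <- data.frame(
  info_country = c("country1", "country3", "country2", "country4", "country5"),
  indicator = c(50, 100, 50, 50, 5))

p1 <- fixed_label_bars(df1)
p2 <- fixed_label_bars(df2)
```

Because both plots share the same limits and the same anchor for the names, the names start at the same horizontal position in p1 and p2. If the longest name runs into the percentage labels, a more negative label_pos gives it more room.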


ggplot: change color of bars and not show all labels in legend

I have not been working with R for long, but have already found many answers to my questions in this community. Now I can't get any further, so I am asking my first question.
Objective: I want to display values from different years (here in the example 10 years) over time in a barplot. Each year should be represented by a column. The years 1 to 9 should get a uniform color, the 10th year another. For the 10th year the value should also be displayed. There should be only two entries in the legend: "Year 1 - 9" and "Year 10".
I have created the following dummy data set:
library(ggplot2)
library(dplyr)   # for %>%, summarise() and n_distinct()
library(stringr) # for str_wrap()
# texts to display
tit <- "Title"
subtit <- "Subtitle"
lab <- c("lab1", "lab2", "lab3", "lab4")
# prepare dataset with random data
n_label <- length(lab)
cohort <-
c(
rep("year01", n_label),
rep("year02", n_label),
rep("year03", n_label),
rep("year04", n_label),
rep("year05", n_label),
rep("year06", n_label),
rep("year07", n_label),
rep("year08", n_label),
rep("year09", n_label),
rep("year10", n_label)
)
data_rel <- runif(40, min = 0, max = .5)
df_data <- data.frame(lab, cohort, data_rel)
df_data %>% summarise(count = n_distinct(cohort)) -> n_cohort
I was able to implement the plot as desired with the following code:
# plot data
df_data %>%
ggplot() +
geom_bar (aes(
x = factor(lab, levels = c("lab1", "lab2", "lab3", "lab4")),
y = data_rel,
fill = cohort
),
stat = "identity",
position = position_dodge()) +
scale_y_continuous(labels = scales::percent, limits = c(0, 1)) +
theme_classic() +
theme(
legend.position = "bottom",
plot.title = element_text(hjust = 0.5,
size = 14,
face = "bold"),
plot.subtitle = element_text(hjust = 0.5),
plot.caption = element_text(hjust = 0.5),
) +
geom_text(
data = subset(df_data, cohort == "year10"),
aes(
x = lab,
y = data_rel,
label = paste0(sprintf("%.1f", data_rel * 100), "%")
),
vjust = -1,
hjust = -1.5,
size = 3
) +
scale_fill_manual(
values = c("#7F7F7F", "#389DC3"),
limits = c("year01", "year10"),
labels = c("Year 1 - 9", "Year 10")
) +
labs(
subtitle = paste(subtit),
title = str_wrap(tit, 45),
x = "",
y = "",
fill = ""
)
Unfortunately, I cannot adjust the colors of the columns for years 1 - 9. Either not all columns get the correct color, or I get unwanted entries in the legend.
Does anyone have an idea what I am doing wrong? I am grateful for every hint!
In setting the fill attribute you can group all other levels of the factor together (here using forcats::fct_other to collapse Years 1-9 into one level) to give your two levels of fill colours. At the same time, using group = cohort will keep bars separate:
library(forcats)
# plot data
df_data %>%
ggplot() +
geom_bar (aes(
x = factor(lab, levels = c("lab1", "lab2", "lab3", "lab4")),
y = data_rel,
group = cohort,
fill = fct_other(cohort, "year10", other_level = "year01")
),
stat = "identity",
position = position_dodge()) +
scale_y_continuous(labels = scales::percent, limits = c(0, 1)) +
theme_classic() +
theme(
legend.position = "bottom",
plot.title = element_text(hjust = 0.5,
size = 14,
face = "bold"),
plot.subtitle = element_text(hjust = 0.5),
plot.caption = element_text(hjust = 0.5),
) +
geom_text(
data = subset(df_data, cohort == "year10"),
aes(
x = lab,
y = data_rel,
label = paste0(sprintf("%.1f", data_rel * 100), "%")
),
vjust = -1,
hjust = -1.5,
size = 3
) +
scale_fill_manual(
values = c("#7822CC", "#389DC3"),
limits = c("year01", "year10"),
labels = c("Year 1 - 9", "Year 10")
) +
labs(
subtitle = paste(subtit),
title = str_wrap(tit, 45),
x = "",
y = "",
fill = ""
)
(Changed manual fill colour to distinguish from unfilled bars)
It's also possible to do this by creating a new 2-level cohort_lumped variable before passing the data to ggplot(), but the approach above keeps your data as consistent as possible up to the graphing stage (and doesn't need an extra column storing essentially the same information).
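For comparison, the cohort_lumped alternative might look roughly like the sketch below. It uses base R's ifelse() rather than forcats, rebuilds toy data with the same structure as the question's df_data, and keeps only the parts of the plot that matter for the fill colours. The seed and the level names "Year 1 - 9" and "Year 10" are choices made for this illustration.

```r
library(ggplot2)

# Toy data with the same structure as df_data in the question
lab <- c("lab1", "lab2", "lab3", "lab4")
cohort <- rep(sprintf("year%02d", 1:10), each = length(lab))
set.seed(1)  # only so that the toy data is reproducible
df_data <- data.frame(lab, cohort, data_rel = runif(40, min = 0, max = .5))

# Collapse years 1 to 9 into one level and keep year 10 as its own level
df_data$cohort_lumped <- factor(
  ifelse(df_data$cohort == "year10", "Year 10", "Year 1 - 9"),
  levels = c("Year 1 - 9", "Year 10")
)

# fill uses the 2-level variable; group = cohort keeps one bar per year
p <- ggplot(df_data, aes(x = lab, y = data_rel,
                         group = cohort, fill = cohort_lumped)) +
  geom_col(position = position_dodge()) +
  scale_fill_manual(values = c("Year 1 - 9" = "#7F7F7F", "Year 10" = "#389DC3"),
                    name = NULL) +
  theme(legend.position = "bottom")
p
```

Naming the entries of values by factor level means the colours do not depend on the order in which the levels happen to be sorted.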

How do I change the symbol for just one legend entry using ggplot2?

I am trying to change the orange dot in the legend to be a diamond with a line through it. I have been unable to change only the one symbol; my attempts have either changed all of the symbols to diamonds, or the legend lists the shapes and colors separately.
Here's reproducible data:
data <- structure(list(Period = c(1, 2, 5, 4, 3),
y1 = c(0.0540540540540541, 0.0256410256410256, 0.454545454545455, 0.451612903225806, 0.333333333333333),
y2 = c(0.157894736842105, 0.163265306122449, 0.277027027027027, 0.289473684210526, 0.318181818181818),
y3 = c(0.0917, 0.1872, 0.1155, 0.0949, 0.2272)), row.names = c(NA, -5L),
class = c("tbl_df", "tbl", "data.frame"))
and
CIinfo <- structure(list(Period = c(1, 2, 3, 4, 5),
PointEstimate = c(0.09170907, 0.18715355, 0.22718774, 0.09494454, 0.11549015),
LowerCI = c(0.02999935,0.09032183, 0.1859676, 0.06469029, 0.08147854),
UpperCI = c(0.1534188, 0.2839853, 0.2684079, 0.1251988, 0.1495018)),
row.names = c(NA, 5L), class = "data.frame")
To generate the plot:
library(ggplot2)
library(ggtext) #text box for plot title
scatter <- ggplot(data) +
geom_point(aes(x=Period, y=y1, colour="prevalence"), size=4) + #colour is for legend label
geom_segment(data = CIinfo, aes(x=Period, y=LowerCI, xend=Period, yend=UpperCI, #bars for 95% CI
colour="estimated probability and 95%CI"),
size=2, lineend = "round", alpha=0.7, show.legend = FALSE) + #alpha is transparency
geom_point(aes(x=Period, y=y2, colour="median prevalence"), size=3) +
geom_point(aes(x=Period, y=y3, colour="estimated probability and 95%CI"), size=4, shape=18) +
theme_minimal() +
scale_color_manual(values = c("#d2d2d2","#365C8DFF","#EB6529FF"),
breaks = c("prevalence","median prevalence","estimated probability and 95%CI"), #set order of legend
labels = ~ stringr::str_wrap(.x, width = 28)) + #width of legend
labs(x = "Time Period",
title ="Estimated Probability and Prevalence Rates") +
theme(plot.title = element_textbox(hjust = 0.5, #center title
margin = margin(b = 15)), #pad under the title
plot.title.position = "plot",
axis.title.x = element_text(margin = margin(t = 10, r = 0, b = 0, l = 0)), #pad x axis label
axis.title.y = element_blank(), # remove y-axis label
axis.text = element_text(face="bold"), #bold axis labels
panel.grid.minor.x = element_blank(), # remove vertical minor gridlines
legend.title = element_blank(), # remove legend label
legend.spacing.y = unit(8, "pt") # space legend entries
) +
guides(colour = guide_legend(byrow = TRUE)) + # space legend entries
scale_y_continuous(labels = scales::percent, limits = c(0, .5)) # y-axis as %
scatter
Does something like this help? I'm using a random example, but hopefully it points you in the right direction:
library(tidyverse)
draw_key_custom <- function(data, params, size) {
if (data$colour == "orange") {
data$size <- .5
draw_key_pointrange(data, params, size)
} else {
data$size <- 2
draw_key_point(data, params, size)
}
}
mtcars |>
ggplot(aes(hp, mpg, color = as.factor(cyl)))+
geom_point(key_glyph = "custom")+
guides(color = guide_legend(
override.aes = list(shape = c(16,16,18),
color= c("black", "black", "orange")))
)
P.S. I borrowed some code from this question: R rotate vline in ggplot legend with scale_linetype_manual

I cannot add source to this chart, nor can I remove legend title

I have created a simple interactive chart of Brazilian bank NPLs using the code below (shown here as a static PNG, so it is no longer interactive). I need to refine the aesthetics a bit more: I want to remove the legend title (series.name), and the source (caption) does not show at the bottom of the chart. I am relatively new to R, but I suspect the reason is that I am using the ggplot2 and plotly packages in combination. These functions work in my other charts, which are all static. I would really appreciate some guidance here.
library(GetBCBData)
library(tidyverse)
library(plotly)
library(ggplot2)
id.series <- c(Total_NPL = 21082,
Corps_NPL = 21083,
Indivs_NPL = 21084)
first.date = '2011-09-01'
# get series from bcb
df_cred <- gbcbd_get_series(id = id.series,
first.date = first.date,
last.date = Sys.Date(),
use.memoise = FALSE)
# check output
glimpse(df_cred)
p <- ggplot(df_cred, aes(x =ref.date, y = value, colour = series.name)) +
geom_line() +
scale_colour_manual(values = c("chartreuse3","deepskyblue3", "darkorange2"))+
labs(y = "% of total loans",
x = '')+
labs(y = "% of total loans",
x = '',
title = "NPL Ratios Brazil % loans ",
caption = "Source: Source: SGS - BCB (by GetBCBData)") +
scale_x_date(date_breaks = "1 year", date_labels = "%b %y")+
theme(panel.background = element_rect(fill = "azure1"))+
theme(panel.grid.major = element_line(colour = "grey", linetype = "dotted"))+
theme(legend.position="top") +
theme(legend.title = element_blank())+
theme(legend.text = element_text(size = 12))+
theme(plot.title = element_text(face = "bold", size = 18))+
geom_text(show.legend = FALSE,aes(label = value),
data = df_cred %>% filter(ref.date == max(ref.date)),
nudge_x = 70,
nudge_y = 0.08,
size = 4.5)+
scale_y_continuous(limits = c(0,NA), expand = c(0,0))+
geom_hline(yintercept=0)
ggplotly(p)
In your ggplot call I did a few things differently. In scale_colour_manual, I added name = "". This removed the legend title.
You called labs two times, I commented one set out.
You called element_blank() in theme(legend.title = ...), but once the legend has no name it isn't needed. (Removing the name through the scale removed it from plotly as well.)
After creating the ggplotly object, I adjusted the margins so that the legend and caption would be visible. I also grayed out the gridlines. FYI, dotted gridlines are currently not an option in plotly, although there is a documented feature request for them.
One last thing: your caption read "Source: Source: ...". In the ggplotly call, I removed one of the "Source:".
Your slightly modified ggplot call:
p <- ggplot(df_cred, aes(x =ref.date, y = value, colour = series.name)) +
geom_line() +
scale_colour_manual(values = c("chartreuse3","deepskyblue3", "darkorange2"),
name = "") +
# labs(y = "% of total loans",
# x = '')+
labs(y = "% of total loans",
x = '',
title = "NPL Ratios Brazil % loans ",
caption = "Source: Source: SGS - BCB (by GetBCBData)") +
scale_x_date(date_breaks = "1 year", date_labels = "%b %y") +
theme(panel.background = element_rect(fill = "azure1"))+
theme(panel.grid.major = element_line(colour = "grey", linetype = "dotted"))+
theme(legend.position="top") +
# theme(legend.title = element_blank())+ # this is not doing anything
theme(legend.text = element_text(size = 12))+
theme(plot.title = element_text(face = "bold", size = 18))+
geom_text(show.legend = FALSE,aes(label = value),
data = df_cred %>% filter(ref.date == max(ref.date)),
nudge_x = 70,
nudge_y = 0.08,
size = 4.5)+
scale_y_continuous(limits = c(0,NA), expand = c(0,0)) +
geom_hline(yintercept=0)
Updates post ggplotly:
ggplotly(p) %>%
layout(
margin = list(b = 50, t = 75), # make space to see the legend/caption
legend = list( # legend placement
orientation = "h",
y = 1.06, x = .5, xanchor = "center",
font = list(size = 12)),
annotations = list( # caption and placement
align = "right", showarrow = F, y = -.08, x = 1,
# x and y values uses the domain [0,1]
font = list(size = 10),
text = "Source: SGS - BCB (by GetBCBData)",
xref = "paper", yref = "paper"), # allow off the plot
xaxis = list(gridcolor = "lightgrey"), # gridlines
yaxis = list(gridcolor = "lightgrey") # gridlines
)
ggplot and ggplotly

ggplot: how to assign both color and shape for one factor, and also shape for another factor?

I must code factor age with both color and shape. I know how to do that (see the plot and data/code below).
In addition, I also have to code factor day with shape.
Is it somehow possible to assign specified different shapes to two different factors?
Below is the legend I would love to achieve (I made an example in PowerPoint).
The plot is not quite right, as only factor age is coded with color and shape.
library(ggplot2)
library(dplyr) # for %>%
df = data.frame(test = c(1,2,3, 1,2,3, 1,2,3, 1,2,3, 1,2,3, 1,2,3),
age = c(1,1,1, 2,2,2, 3,3,3, 1,1,1, 2,2,2, 3,3,3),
day = c(1,1,1,1,1,1,1,1,1, 2,2,2,2,2,2,2,2,2),
result = c(1,2,2,1,1,2,2,1,0, 2,2,0,1,2,1,2,1,0))
df$test <- factor((df$test))
df$age <- factor((df$age))
df$day <- factor((df$day))
windows(width=4, height=3 )
df %>%
ggplot( aes(x=test, y=result)) +
geom_point(aes(color=age, shape=age, group=age),
position=position_jitterdodge(dodge.width=0.8,jitter.height=0.2, jitter.width=0),
size=2, stroke=0.8) +
scale_shape_manual(values=c(16, 15, 17), name="", labels=c("young","older","the oldest")) +
scale_color_manual(name="", labels=c("young","older","the oldest"), values=c('#009E73','#56B4E9','#D55E00')) +
theme_bw()+
theme(panel.border = element_blank(), axis.ticks = element_blank(),
legend.position=c(), legend.text=element_text(size=10, face="bold"), legend.title=element_text(size=10),
panel.grid.major.x = element_blank() ,
panel.grid.major.y = element_blank() ,
plot.title = element_text(size=10, face = "bold"), axis.title=element_text(size=11),
axis.text.y = element_text(size=9, angle = 45),
axis.text.x = element_text(size=9, angle = 90),
plot.margin = unit(c(0.5,0.2,0,0), "cm")) +
labs(title= "", x = "",y = "Test result") +
scale_y_continuous(breaks=c(0,1,2), labels=c('good','better','the best')) +
geom_vline(xintercept=c(0.5,1.5,2.5),color="grey90")+
geom_hline(yintercept=-0.5, color="grey90")+
expand_limits(x=3.9, y=c(0,2.35)) +
scale_x_discrete(limits = c("1", "2", "3"),labels = c("test a", "test b", "test c")) +
coord_cartesian(clip = "off")
You can map shape to an interaction between age and day, and map color to age only. Then remove the color legend and color the shape legend manually with override.aes.
This comes close to what you want. The labels can be changed; I've defined them when creating the factors.
how to make fancy legends
However, you want quite a fancy legend, so the easiest approach would be to build the legend yourself as a separate plot and combine it with the main panel (a "fake legend"). This requires some semi-hardcoding, but you're not shy about that anyway, given the manual definition of your shapes. See Part Two for how to do this.
Part one
library(ggplot2)
df = data.frame(test = c(1,2,3, 1,2,3, 1,2,3, 1,2,3, 1,2,3, 1,2,3),
age = c(1,1,1, 2,2,2, 3,3,3, 1,1,1, 2,2,2, 3,3,3),
day = c(1,1,1,1,1,1,1,1,1, 2,2,2,2,2,2,2,2,2),
result = c(1,2,2,1,1,2,2,1,0, 2,2,0,1,2,1,2,1,0))
df$test <- factor(df$test)
## note I'm changing this here already!! If you go to the effort of converting to
## factor, define levels and labels here
df$age <- factor(df$age, labels = c("young", "older", "the oldest"))
df$day <- factor(df$day, labels = paste("Day", 1:2))
ggplot(df, aes(x=test, y=result)) +
geom_jitter(aes(color=age, shape=interaction(day, age)),
width = .1, height = .1) +
## you won't get around manually defining the shapes
scale_shape_manual(values = c(0, 15, 1, 16, 2, 17)) +
scale_color_manual(values = c('#009E73','#56B4E9','#D55E00')) +
guides(color = "none",
shape = guide_legend(
override.aes = list(color = rep(c('#009E73','#56B4E9','#D55E00'), each = 2)),
ncol = 3))
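The order of the six shapes and of the repeated colours in override.aes follows the level order of interaction(day, age), in which the first factor varies fastest. A small base R check (using a reduced toy version of the two factors, one row per combination) makes that order visible. The named vectors at the end are an optional variation, not part of the answer above.

```r
# One row per age/day combination, with the same labels as in the answer
age <- factor(c(1, 1, 2, 2, 3, 3), labels = c("young", "older", "the oldest"))
day <- factor(c(1, 2, 1, 2, 1, 2), labels = paste("Day", 1:2))

# Levels are ordered with the first factor (day) varying fastest
lev <- levels(interaction(day, age))
print(lev)
# "Day 1.young" "Day 2.young" "Day 1.older" "Day 2.older"
# "Day 1.the oldest" "Day 2.the oldest"

# Optional: name the shapes and colours by level so the pairing is explicit
shapes  <- setNames(c(0, 15, 1, 16, 2, 17), lev)
colours <- setNames(rep(c('#009E73', '#56B4E9', '#D55E00'), each = 2), lev)
```

So each age group gets a hollow shape for Day 1 and the matching filled shape for Day 2, and each colour has to be repeated twice in a row. A named vector such as shapes can be passed to scale_shape_manual(values = shapes), which matches by level name instead of by position.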
Part two - the fake legend
library(ggplot2)
library(dplyr)
library(patchwork)
## df and factor creation as above !!!
p_panel <-
ggplot(df, aes(x=test, y=result)) +
geom_jitter(aes(color=age, shape=interaction(day, age)),
width = .1, height = .1) +
## you won't get around manually defining the shapes
scale_shape_manual(values = c(0, 15, 1, 16, 2, 17)) +
scale_color_manual(values = c('#009E73','#56B4E9','#D55E00')) +
## for this solution, I'm removing the legend entirely
theme(legend.position = "none")
## make the data frame for the fake legend
## the y coordinates should be defined relative to the y values in your panel
y_coord <- c(.9, 1.1)
df_legend <- df %>% distinct(day, age) %>%
mutate(x = rep(1:3,2), y = rep(y_coord,each = 3))
## The legend plot is basically the same as the main plot, but without legend -
## because it IS the legend ... ;)
lab_size = 10*5/14
p_leg <-
ggplot(df_legend, aes(x=x, y=y)) +
geom_point(aes(color=age, shape=interaction(day, age))) +
## I'm annotating in separate layers because it keeps it clearer (for me)
annotate(geom = "text", x = unique(df_legend$x), y = max(y_coord)+.1,
size = lab_size, angle = 45, hjust = 0,
label = c("young", "older", "the oldest")) +
annotate(geom = "text", x = max(df_legend$x)+.2, y = y_coord,
label = paste("Day", 1:2), size = lab_size, hjust = 0) +
scale_shape_manual(values = c(0, 15, 1, 16, 2, 17)) +
scale_color_manual(values = c('#009E73','#56B4E9','#D55E00')) +
theme_void() +
theme(legend.position = "none",
plot.margin = margin(r = .3,unit = "in")) +
## you need to turn clipping off and define the same y limits as your panel
coord_cartesian(clip = "off", ylim = range(df$result))
## now combine them
p_panel + p_leg +
plot_layout(widths = c(1,.2))
So, again following "Filled and hollow shapes where the fill color = the line color", the following code produces the plot itself, although it does not give you the legend.
df %>%
ggplot( aes(x=test, y=result)) +
geom_point(aes(color=age,
shape=age,
group=age,
fill=factor(ifelse(day==1, NA, age))), # STEP 1
position=position_jitterdodge(dodge.width=0.8,jitter.height=0.2, jitter.width=0),
size=2, stroke=0.8) +
scale_shape_manual(values=c(22,21,24), name="", labels=c("young","older","the oldest")) +
scale_color_manual(name="", labels=c("young","older","the oldest"), values=c('#009E73','#56B4E9','#D55E00')) +
scale_fill_manual(name="",
labels=c("young","older","the oldest"),
values=c('#009E73','#56B4E9','#D55E00'),
na.value=NA, guide="none") # STEP 2
I was misleading in my comment: rather than "hollow" shapes, we want shapes 21 through 25. These accept distinct fill and color.
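A minimal illustration of why these shapes help, assuming nothing beyond ggplot2 itself: with shape 21 the outline is controlled by colour and the interior by fill, so the same shape can appear hollow or filled. The toy data frame and the use of "transparent" for the hollow case are choices made for this sketch.

```r
library(ggplot2)

# Two points drawn with the same shape and outline colour
demo <- data.frame(x = c(1, 2), y = c(1, 1),
                   day = factor(c("Day 1", "Day 2")))

p <- ggplot(demo, aes(x, y)) +
  # shape 21 is a circle with a separate outline (colour) and interior (fill)
  geom_point(aes(fill = day), shape = 21, colour = "#D55E00",
             size = 6, stroke = 1) +
  # Day 1 stays hollow, Day 2 is filled with the outline colour
  scale_fill_manual(values = c("Day 1" = "transparent", "Day 2" = "#D55E00"),
                    guide = "none")
p
```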

Complex Chart in R/ggplot with Proper Legend Display

This is my first question to StackExchange, and I've searched for answers that have been helpful, but haven't really gotten me to where I'd like to be.
This is a stacked bar chart, combined with a point chart, combined with a line.
Here's my code:
library(ggplot2)
theme_set(theme_light())
library(lubridate)
FM <- as.Date('2018-02-01')
x.range <- c(FM - months(1) - days(1) - days(day(FM) - 1), FM - days(day(FM) - 1) + months(1))
x.ticks <- seq(x.range[1] + days(1), x.range[2], by = 2)
#populate example data
preds <- data.frame(FM = FM, DATE = seq(x.range[1] + days(1), x.range[2] - days(1), by = 1))
preds <- data.frame(preds, S_O = round(seq(1, 1000000, by = 1000000/nrow(preds))))
preds <- data.frame(preds, S = round(ifelse(month(preds$FM) == month(preds$DATE), day(preds$DATE) / 30.4, 0) * preds$S_O))
preds <- data.frame(preds, O = preds$S_O - preds$S)
preds <- data.frame(preds, pred_sales = round(1000000 + rnorm(nrow(preds), 0, 10000)))
preds$ma <- with(preds, stats::filter(pred_sales, rep(1/5, 5), sides = 1))
y.max <- ceiling(max(preds$pred_sales) / 5000) * 5000 + 15000
line.cols <- c(O = 'palegreen4', S = 'steelblue4',
P = 'maroon', MA = 'blue')
fill.cols <- c(O = 'palegreen3', S = 'steelblue3',
P = 'red')
p <- ggplot(data = preds,
mapping = aes(DATE, pred_sales))
p <- p +
geom_bar(data = reshape2::melt(preds[,c('DATE', 'S', 'O')], id.var = 'DATE'),
mapping = aes(DATE, value, group = 1, fill = variable, color = variable),
width = 1,
stat = 'identity',
alpha = 0.5) +
geom_point(mapping = aes(DATE, pred_sales, group = 2, fill = 'P', color = 'P'),
shape = 22, #square
alpha = 0.5,
size = 2.5) +
geom_line(data = preds[!is.na(preds$ma),],
mapping = aes(DATE, ma, group = 3, color = 'MA'),
alpha = 0.8,
size = 1) +
geom_text(mapping = aes(DATE, pred_sales, label = formatC(pred_sales / 1000, format = 'd', big.mark = ',')),
angle = 90,
size = 2.75,
hjust = 1.25,
vjust = 0.4) +
labs(title = sprintf('%s Sales Predictions - %s', 'Overall', format(FM, '%b %Y')),
x = 'Date',
y = 'Volume in MMlbs') +
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1, size = 8),
panel.grid.major.x = element_blank(),
panel.grid.minor.x = element_blank(),
legend.title = element_blank(),
legend.position = 'bottom',
legend.text = element_text(size = 8),
legend.margin = margin(t = 0.25, unit = 'cm')) +
scale_x_date(breaks = x.ticks,
date_labels = '%b %e',
limits = x.range) +
scale_y_continuous(limits = c(0, y.max),
labels = function(x) { formatC(x / 1000, format='d', big.mark=',') }) +
scale_color_manual(values = line.cols,
breaks = c('MA'),
labels = c(MA = 'Mvg Avg (5)')) +
scale_fill_manual(values = fill.cols,
breaks = c('P', 'O', 'S'),
labels = c(O = 'Open Orders', S = 'Sales', P = 'Predictions'))
p
The chart it generates is this:
As you can see, the legend does a couple of funky things. It's close, but not quite there. I only want boxes with exterior borders for Predictions, Open Orders, and Sales, and only a blue line for the Mvg Avg (5).
Any advice would be appreciated.
Thanks!
Rather late, but if you are still interested to understand this problem, the following should work. Explanations are included as comments within the code:
library(dplyr)
preds %>%
# scale the values for ALL numeric columns in the dataset, before
# passing the dataset to ggplot()
mutate_if(is.numeric, ~./1000) %>%
# since x / y mappings are stated in the top level ggplot(), there's
# no need to repeat them in the subsequent layers UNLESS you want to
# override them
ggplot(mapping = aes(x = DATE, y = pred_sales)) +
# 1. use data = . to inherit the top level data frame, & modify it on
# the fly for this layer; this is neater as you are essentially
# using a single data source for the ggplot object.
# 2. geom_col() is a more succinct way to say geom_bar(stat = "identity")
# (I'm using tidyr rather than reshape package, since ggplot2 is a
# part of the tidyverse packages, & the two play together nicely)
geom_col(data = . %>%
select(S, O, DATE) %>%
tidyr::gather(variable, value, -DATE),
aes(y = value, fill = variable, color = variable),
width = 1, alpha = 0.5) +
# don't show legend for this layer (o/w the fill / color legend would
# include a square shape in the centre of each legend key)
geom_point(aes(fill = 'P', color = 'P'),
shape = 22, alpha = 0.5, size = 2.5, show.legend = FALSE) +
# use data = . %>% ... as above.
# since the fill / color aesthetic mappings from the geom_col layer would
# result in a border around all fill / color legends, avoid it all together
# here by hard coding the line color to "blue", & map its linetype instead
# to create a separate linetype-based legend later.
geom_line(data = . %>% na.omit(),
aes(y = ma, linetype = 'MA'),
color = "blue", alpha = 0.8, size = 1) +
# scales::comma is a more succinct alternative to formatC for this use case
geom_text(aes(label = scales::comma(pred_sales)),
angle = 90, size = 2.75, hjust = 1.25, vjust = 0.4) +
labs(title = sprintf('%s Sales Predictions - %s', 'Overall', format(FM, '%b %Y')),
x = 'Date',
y = 'Volume in MMlbs') +
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1, size = 8),
panel.grid.major.x = element_blank(),
panel.grid.minor.x = element_blank(),
legend.title = element_blank(),
legend.position = 'bottom',
legend.text = element_text(size = 8),
legend.margin = margin(t = 0.25, unit = 'cm')) +
scale_x_date(breaks = x.ticks,
date_labels = '%b %e',
limits = x.range) +
# as above, scales::comma is more succinct
scale_y_continuous(limits = c(0, y.max / 1000),
labels = scales::comma) +
# specify the same breaks & labels for the manual fill / color scales, so that
# a single legend is created for both
scale_color_manual(values = line.cols,
breaks = c('P', 'O', 'S'),
labels = c(O = 'Open Orders', S = 'Sales', P = 'Predictions')) +
scale_fill_manual(values = fill.cols,
breaks = c('P', 'O', 'S'),
labels = c(O = 'Open Orders', S = 'Sales', P = 'Predictions')) +
# create a separate line-only legend using the linetype mapping, with
# value = 1 (i.e. unbroken line) & specified alpha / color to match the
# geom_line layer
scale_linetype_manual(values = 1,
labels = 'Mvg Avg (5)',
guide = guide_legend(override.aes = list(alpha = 1,
color = "blue")))
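Stripped of everything else, the key idea is to keep the line out of the fill / color legend by hard-coding its colour and mapping only its linetype. The sketch below shows this on toy data; the data frame toy, the seed, and the 3-point moving average are made up for the illustration and are not the asker's data.

```r
library(ggplot2)

set.seed(42)  # only so that the toy data is reproducible
toy <- data.frame(day = 1:10, sales = round(runif(10, 50, 100)))
# Trailing 3-point moving average; the first two values are NA
toy$ma <- as.numeric(stats::filter(toy$sales, rep(1 / 3, 3), sides = 1))

p <- ggplot(toy, aes(day, sales)) +
  # Bars contribute to the fill legend only
  geom_col(aes(fill = "S"), alpha = 0.5) +
  # The line colour is set, not mapped, so it stays out of the fill legend;
  # mapping linetype creates a separate, line-only legend
  geom_line(data = toy[!is.na(toy$ma), ],
            aes(y = ma, linetype = "MA"), colour = "blue") +
  scale_fill_manual(values = c(S = "steelblue3"),
                    labels = c(S = "Sales"), name = NULL) +
  scale_linetype_manual(values = c(MA = 1),
                        labels = c(MA = "Mvg Avg (3)"), name = NULL) +
  theme(legend.position = "bottom")
p
```

The legend then shows one box for the bars and one plain blue line for the moving average, with no border or box drawn around the line key.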
