Secondary axis in ggplot [duplicate] - r

I have the following tibble and I want to create a chart with two y-axes.
sample <- climate <- tibble(
Month = c("1/1/2019","2/1/2019","3/1/2019","4/1/2019","5/1/2019","6/1/2019","7/1/2019","8/1/2019","9/1/2019","10/1/2019","11/1/2019","12/1/2019","1/1/2020","2/1/2020","3/1/2020"),
Reactions = c(52111,37324,212695,152331,24973,10878,7413,8077,13066,50486,8087,12600,31625,25578,20069),
Ratio = c(1371,1866,6445,4914,925,363,218,245,335,1530,352,525,1506,1112,873)
)
Here's what I tried so far.
ggplot() +
geom_bar(mapping = aes(x = sample$Month, y = sample$Reactions), stat = 'identity') +
geom_line(mapping = aes(x = sample$Month , y = sample$Ratio), size = 2, color = "red") +
scale_y_continuous(name = "Reactions per Month", sec.axis = sec_axis(trans = ~./20, name = "Reactions/ post"))
Any help will be appreciated

You have to recode the Month column as a date, and multiply Ratio by 20 (since you divided the second axis by 20):
library(lubridate)
sample$Month <- mdy(sample$Month)
ggplot() +
geom_bar(mapping = aes(x = sample$Month, y = sample$Reactions), stat = 'identity') +
geom_line(mapping = aes(x = sample$Month , y = sample$Ratio*20), size = 2, color = "red") +
scale_y_continuous(name = "Reactions per Month", sec.axis = sec_axis(trans = ~./20, name = "Reactions/ post"))
You can also simplify your code by passing the data frame to ggplot() and referring to columns by name:
ggplot(sample, aes(x = Month)) +
geom_bar(aes(y = Reactions), stat = 'identity') +
geom_line(aes(y = Ratio*20), size = 2, color = "red") +
scale_y_continuous(name = "Reactions per Month", sec.axis = sec_axis(trans = ~./20, name = "Reactions/ post"))
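The factor 20 above is hard-coded. A minimal sketch, assuming you would rather derive it from the data: `scale_factor` and `sample_df` are names introduced here for illustration, and only three rows of the question's data are reproduced. The same value scales the line up and the secondary axis back down, so the two cannot drift apart.

```r
library(ggplot2)

# Three rows from the question's data, with Month already parsed as a Date
sample_df <- data.frame(
  Month     = as.Date(c("2019-01-01", "2019-02-01", "2019-03-01")),
  Reactions = c(52111, 37324, 212695),
  Ratio     = c(1371, 1866, 6445)
)

# Illustrative value: how many primary-axis units one Ratio unit should span
scale_factor <- max(sample_df$Reactions) / max(sample_df$Ratio)

p <- ggplot(sample_df, aes(x = Month)) +
  geom_col(aes(y = Reactions)) +
  # scale the line up into the range of the primary axis ...
  geom_line(aes(y = Ratio * scale_factor), color = "red") +
  # ... and undo the same factor when labelling the secondary axis
  scale_y_continuous(
    name = "Reactions per Month",
    sec.axis = sec_axis(~ . / scale_factor, name = "Reactions/ post")
  )
p
```

With this choice the tallest point of the line reaches the top of the tallest bar; any other positive factor works as long as both places use it.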

Related

change starting value for geom_bar

I have a plot that includes data from two different scales. So far, I've plotted both variables and adjusted the scale of one variable (ss) so that it is closer to the other variables. This greatly reduced the white space in the middle of the plot.
set.seed(42)
df <- data.frame(
cat = runif(10, 1, 20),
mean = runif(10, 350, 450),
ss = runif(10, 1, 50))
ggplot(data = df) +
geom_bar(aes(x = cat, y = ss + 250),
stat = "identity",
fill = "red") +
geom_point(aes(x = cat, y = mean)) +
geom_smooth(aes(x = cat, y = mean),
method = "loess", se = TRUE) +
scale_y_continuous(sec.axis = sec_axis(trans = ~.-250,
name = "sample size")) +
labs(y = "mean") +
theme_bw()
However, I don't love the really long bars for sample size, and I'd like to change the limits on the left y axis so that it starts at 250 (where ss = 0). Unfortunately, if I replace my current scale_y_continuous parameter with limits (see below), then the bars disappear. How do I do this?
ggplot(data = df) +
geom_bar(aes(x = cat, y = ss + 250),
stat = "identity",
fill = "red") +
geom_point(aes(x = cat, y = mean)) +
geom_smooth(aes(x = cat, y = mean),
method = "loess", se = TRUE) +
scale_y_continuous(limits = c(250, 510), ### NEW Y AXIS LIMITS
sec.axis = sec_axis(trans = ~.-250,
name = "sample size")) +
labs(y = "mean") +
theme_bw()
EDIT: Updated plot with @AllanCameron's suggestion. This is really close, but the bars now extend below 0 on the secondary axis.
ggplot(data = df) +
geom_bar(aes(x = cat, y = ss + 250),
stat = "identity",
fill = "red") +
geom_point(aes(x = cat, y = mean)) +
geom_smooth(aes(x = cat, y = mean),
method = "loess", se = TRUE) +
scale_y_continuous(sec.axis = sec_axis(trans = ~.-250,
name = "sample size")) +
labs(y = "mean") +
theme_bw() +
coord_cartesian(ylim = c(250, 510)) ### NEW
Just set the expand parameter in scale_y_continuous() to c(0,0).
This tells ggplot2 not to add padding around the plot area, so the axis starts exactly at 250.
ggplot(data = df) +
geom_bar(aes(x = cat, y = ss + 250),
stat = "identity",
fill = "red") +
geom_point(aes(x = cat, y = mean)) +
geom_smooth(aes(x = cat, y = mean),
method = "loess", se = TRUE) +
scale_y_continuous(sec.axis = sec_axis(trans = ~.-250, name = "sample size"),
expand = c(0,0)) + # New line here!
labs(y = "mean") +
theme_bw() +
coord_cartesian(ylim = c(250, 510))
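Why the bars vanish with `limits` but survive `coord_cartesian()`: continuous scales in ggplot2 default to `oob = scales::censor`, which replaces out-of-range values with `NA` before drawing, whereas `coord_cartesian()` only zooms. A small sketch of the censoring step on one bar's edges (the values and the name `bar_edges` are made up for illustration):

```r
library(scales)

# A bar for ss = 30 is drawn from ymin = 0 up to ymax = 30 + 250
bar_edges <- c(ymin = 0, ymax = 280)

# Scale limits censor values outside the range (they become NA),
# and a bar with a missing edge cannot be drawn
censored <- censor(bar_edges, range = c(250, 510))
censored
```

The lower edge at 0 falls outside `c(250, 510)`, so it is dropped; that is why every bar disappears once limits are set on the scale.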

How to graph two different columns on one ggplot?

I am trying to plot one column by Date (different color points for each animal category) and on the same graph, plot a second column by Date as well. The second column has entries for the days but only for certain categories, Large Dog. There is no adoption_with_discount for small or medium dogs (please see the reproducible example data set, example_data). When I plot them separately they visualize fine but not when plotted together. I thought I would just overlay a separate geom but that is not working.
I want to combine the two plots into one. My goal is for the points plot to have the line graph on top of it. I am trying to visualize the adoption as points colored by animal and put a line on the same graph of adoption_with_discount.
Thank you for your help!
# Make example -----------------------------------------------------------
# Here is an example data set
# You can see in the `adoption_with_discount` the values I want to add as a line.
library(lubridate)
library(tidyverse)
example_days <- data.frame(Date = c(seq.Date(from = as.Date('2022-03-01'), to = as.Date('2022-04-30'), by = 'days')))
example_small <-
example_days %>%
mutate(animal = "Small Dog")
a <-sample(100:150, nrow(example_small), rep = TRUE)
example_small <-
example_small %>%
mutate(adoption = a,
adoption_with_discount = NA)
example_med <-
example_days %>%
mutate(animal = "Medium Dog")
b <-sample(150:180, nrow(example_med), rep = TRUE)
example_med <-
example_med %>%
mutate(adoption = b,
adoption_with_discount = NA)
example_large <-
example_days %>%
mutate(animal = "Large Dog")
c <-sample(150:200, nrow(example_large), rep = TRUE)
example_large <-
example_large %>%
mutate(adoption = c)
example_large <-
example_large %>%
mutate(adoption_with_discount = adoption - 15)
example_data <- rbind(example_small, example_med, example_large)
# Plot --------------------------------------------------------------------
ggplot(data = example_data) +
geom_point(mapping = aes(x = Date,
y = adoption,
color = animal)) +
ggtitle("Dog Adoption by Size") +
labs(x = "Date", y = "Adoption Fee") +
scale_y_continuous(labels = scales::dollar) +
theme(axis.text.x = element_text(angle = 45))
# Plot with Fee -----------------------------------------------------------
# This is where the problem is occurring
# When I want to add a line that plots the adoption with discount by day
# on top of the points, it does not populate.
ggplot(data = example_data) +
geom_point(mapping = aes(x = Date,
y = adoption,
color = animal)) +
geom_line(mapping = aes(x = Date,
y = adoption_with_discount),
color = "black") +
ggtitle("Dog Adoption by Size with Discount Included") +
labs(x = "Date", y = "Adoption Fee") +
scale_y_continuous(labels = scales::dollar) +
theme(axis.text.x = element_text(angle = 45))
# See if just Discount will Plot -----------------------------------------
#This plots separately
ggplot(data = example_large) +
geom_line(mapping = aes(x = Date,
y = adoption_with_discount),
color = "black") +
ggtitle("Discount") +
labs(x = "Date", y = "Adoption Fee") +
scale_y_continuous(labels = scales::dollar) +
theme(axis.text.x = element_text(angle = 45))
While subsetting is an option to fix the issue, the reason why no line is plotted is simply the missing grouping, i.e. in geom_line you are trying to plot the observations for all three dog types as one group or line. Because of the NAs, no line shows up. An easy way to solve that is to explicitly map animal to the group aesthetic. Additionally, I added na.rm = TRUE to silence the warning about removed NAs. Finally, I right-aligned your axis labels by adding hjust = 1:
library(ggplot2)
ggplot(data = example_data) +
geom_point(mapping = aes(
x = Date,
y = adoption,
color = animal
)) +
geom_line(
mapping = aes(
x = Date,
y = adoption_with_discount,
group = animal
),
color = "black",
na.rm = TRUE
) +
ggtitle("Dog Adoption by Size with Discount Included") +
labs(x = "Date", y = "Adoption Fee") +
scale_y_continuous(labels = scales::dollar) +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
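To see why the ungrouped line is empty, here is a rough base R model of what happens, assuming the path is ordered by Date and that a segment is only drawn between two neighbouring non-missing points. The data frame `toy` and the counting logic are illustrative, not ggplot2 internals.

```r
# Two dates, three animals per date; only "Large Dog" has a discount value
toy <- data.frame(
  Date   = rep(as.Date(c("2022-03-01", "2022-03-02")), each = 3),
  animal = rep(c("Small Dog", "Medium Dog", "Large Dog"), times = 2),
  adoption_with_discount = c(NA, NA, 150, NA, NA, 160)
)

# Without a group aesthetic, all rows form one path ordered by Date
one_path <- toy[order(toy$Date), "adoption_with_discount"]

# Count pairs of neighbouring points that are both non-missing
has_value <- !is.na(one_path)
segments_single_group <- sum(has_value[-1] & has_value[-length(has_value)])

# With group = animal, each animal is its own path
per_animal <- split(toy$adoption_with_discount, toy$animal)
large <- !is.na(per_animal[["Large Dog"]])
segments_large_dog <- sum(large[-1] & large[-length(large)])

c(single_group = segments_single_group, large_dog = segments_large_dog)
```

In the single path every real value is separated from the next by the NAs of the other two animals, so there is nothing to connect; within the Large Dog group the values are adjacent.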
Based on the discussion here, I found that you can use the subset argument in the aes of geom_line to select the values that are not NA in the adoption_with_discount column.
ggplot(data = example_data) +
geom_point(mapping = aes(x = Date,
y = adoption,
color = animal)) +
geom_line(mapping = aes(x = Date,
y = adoption_with_discount),
color = "black") +
ggtitle("Dog Adoption by Size with Discount Included") +
labs(x = "Date", y = "Adoption Fee") +
scale_y_continuous(labels = scales::dollar) +
theme(axis.text.x = element_text(angle = 45)) +
geom_line(mapping = aes(x = Date,
y = adoption_with_discount,
subset = !is.na(adoption_with_discount)),
color = "black") +
ggtitle("Discount") +
labs(x = "Date", y = "Adoption Fee") +
scale_y_continuous(labels = scales::dollar) +
theme(axis.text.x = element_text(angle = 45))
It looks like the NAs included in the geom_line layer are causing the issue, so you can filter them out before plotting the line:
ggplot(data = example_data) +
geom_point(mapping = aes(x = Date,
y = adoption,
color = animal)) +
geom_line(data=example_data %>% filter(!is.na(adoption_with_discount)),
mapping = aes(x = Date,
y = adoption_with_discount),
color = "black") +
ggtitle("Dog Adoption by Size with Discount Included") +
labs(x = "Date", y = "Adoption Fee") +
scale_y_continuous(labels = scales::dollar) +
theme(axis.text.x = element_text(angle = 45))

Create a legend, change Y-axis for a ggplot2 plot that has a geom_col and 4 geom_line plots

I have looked at these articles: Legend formatting for ggplot with geom_col and geom_line and: Controlling legend appearance in ggplot2 with override.aes
I have a plot with a geom_col layer and 4 geom_line layers. I would like to be able to show a legend with no title using the colors that I chose when generating each layer in Plot 1. Alternatively, if I could change the colors and legend title in Plot 2 that might work. I would also like to change the Y-axis scale since the production data will never drop below 20 acre-feet but when I use scale_y_continuous with geom_col the bars do not appear.
Data
structure(list(Month_Day = structure(c(18628, 18629, 18630, 18631,
18632, 18633, 18634, 18635, 18636, 18637), class = "Date"), `2021` = c(37.57464179,
35.95859072, 39.14746726, 37.9630674, 40.55096688, 42.53487456,
41.66243889, 40.18150773, 37.90217088, 38.30979013), Mean = c(36.9619627272733,
36.2260000000002, 36.4146654545455, 36.3741972727362, 36.9979654545361,
36.5846318181822, 35.7040999999992, 36.3543818181826, 36.3387527272723,
36.5865854545457), Median = c(36.44704, 35.9190000000022, 35.56288,
36.19223, 35.82997, 36.21986, 35.69489, 35.07475, 36.29047, 36.18302
), Maximum = c(40.2170000000067, 42.43354, 41.04897, 39.2959999999978,
42.3659999999978, 40.524, 41.24238, 41.48184, 38.49166, 39.26223
), Minimum = c(34.1998, 31.78064, 33.38932, 34.74012, 33.96955,
31.54425, 31.48285, 33.49063, 35.02563, 34.7831)), row.names = c(NA,
10L), class = "data.frame")
Plot 1
library(ggplot2)
library(scales)
p_2021 <- ggplot(dfinal21, aes(x = Month_Day)) +
geom_col(aes(y = `2021`), size = 1, color = "darkblue", fill = "white") +
geom_line(aes(y = `Median`), size = 1.0, color="red") +
geom_line(aes(y = `Mean`), size = 1.0, color="yellow") +
geom_line(aes(y = `Maximum`), size = 1.0, color="darkgreen") +
geom_line(aes(y = `Minimum`), size = 1.0, color = "darkviolet") +
scale_x_date(breaks = "1 month", minor_breaks = "1 day", labels=date_format("%b")) +
labs(x = "Month", y = "acre-feet", title = "Treatment Plant Production 2021")
Plot 2
library(ggplot2)
library(scales)
p_2021 <- ggplot(dfinal21) +
geom_col(aes(x = Month_Day, y = `2021`, color = "2021")) +
geom_line(aes(x = Month_Day, y = `Median`, color = "Median")) +
geom_line(aes(x = Month_Day, y = `Mean`, color= "Mean")) +
geom_line(aes(x = Month_Day, y = `Maximum`, color= "Maximum")) +
geom_line(aes(x = Month_Day, y = `Minimum`, color= "Minimum")) +
scale_x_date(breaks = "1 month", minor_breaks = "1 day", labels=date_format("%b")) +
labs(x = "Month", y = "acre-feet", title = "Treatment Plant Production 2021") +
theme(legend.background = element_rect(fill = "white"))
It turns out that I did not have my data in the format that ggplot2 likes (Tidy data). An acquaintance helped me out. Here's the new code:
dfinal21a <- dfinal21 %>%
mutate(y2021 = `2021`) %>% #rename column
select(-(`2021`)) %>% # remove old col name
pivot_longer(cols = Mean:y2021, names_to = "measure", values_to = "data") #tidy the data
p_2021 <- ggplot() +
geom_col(data = filter(dfinal21a, measure == "y2021"),
aes(y = data, x = Month_Day),
color = "darkblue", fill = "white") +
geom_line(data = filter(dfinal21a, measure != "y2021"),
aes(y = data, x = Month_Day, color = measure)) +
scale_color_manual(values = c(Median = "red", Mean = "yellow", Maximum = "darkgreen", Minimum = "darkviolet")) + # named so each measure keeps its Plot 1 color; a second color scale would replace this one
coord_cartesian(ylim = c(20,120)) + #changed range to include maximums
theme(legend.title = element_blank()) +
scale_x_date(breaks = "1 month", minor_breaks = "1 day", labels=date_format("%b")) +
labs(x = "Month", y = "acre-feet", title = "Treatment Plant Production 2021")
Here is a solution! If this is what you are looking for let me know. I then can explain the code:
The code:
library(tidyverse)
library(scales)
# data manipulation
dfinal21_long <- dfinal21 %>%
pivot_longer(
cols = c(`2021`, Mean, Median, Maximum, Minimum),
names_to = "names",
values_to = "values"
) %>%
mutate(color = case_when(names=="Mean" ~ "red",
names=="Median" ~ "yellow",
names=="Maximum" ~ "darkgreen",
names=="Minimum" ~ "darkviolet")) %>%
arrange(names)
# function for custom transformation of y axis
shift_trans = function(d = 0) {
scales::trans_new("shift", transform = function(x) x - d, inverse = function(x) x + d)
}
# plot
ggplot() +
geom_col(data= dfinal21_long[1:10,],mapping = aes(x=Month_Day,y=values), size = 1, color = "darkblue", fill = "white" ) +
geom_line(data= dfinal21_long[11:50,],mapping=aes(x=Month_Day,y=values, color=color), size = 1.0) +
scale_x_date(breaks = "1 month", minor_breaks = "1 day", labels=date_format("%b")) +
labs(x = "Month", y = "acre-feet", title = "Treatment Plant Production 2021") +
scale_y_continuous(trans = shift_trans(20))+
scale_color_identity(guide = "legend", breaks = c("darkgreen","red","yellow","darkviolet"), labels = c("Maximum","Mean","Median","Minimum")) + # the color column already holds color names; pair each one with its measure
theme_minimal() +
theme(legend.position="bottom") +
theme(legend.title=element_blank())
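A short check of what the custom transformation does, reusing the `shift_trans` helper from the answer (the variable names below are illustrative). Bars are drawn from 0 in transformed space, and the inverse shows which axis label that baseline corresponds to.

```r
library(scales)

# Same helper as in the answer above
shift_trans <- function(d = 0) {
  trans_new("shift", transform = function(x) x - d, inverse = function(x) x + d)
}

shift <- shift_trans(20)

# The bar baseline (0 in transformed space) is labelled as 20 acre-feet
baseline_label <- shift$inverse(0)

# Data values are moved down by 20 before drawing
transformed <- shift$transform(c(20, 37.5))
```

Unlike setting `limits`, this keeps the bars because no value is pushed outside the scale; it only moves where the bars start.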

Making legends with ggplot2 [duplicate]

This question already has answers here:
How to add legend to ggplot manually? - R [duplicate]
(1 answer)
Missing legend with ggplot2 and geom_line
(1 answer)
Closed 1 year ago.
I'm trying to plot some poll results but always have to add the legend manually after creating the plot. Here's my data (poll.csv):
Date,AKP,MHP,CHP,IYI,HDP,DEVA,GP,SP
01-05-2021,34.8,9,27.2,13.1,8.9,3.1,1.8,1.6
01-06-2021,36.2,9.1,20.3,16.7,10.5,3.6,1.3,0.6
01-14-2021,35,9.5,25,13,10.7,2.5,2.6,0.8
01-27-2021,35.6,8.5,24.9,13.6,8.9,2.3,2.5,1.6
01-30-2021,35.6,8.1,25.3,13.4,10.5,2.9,3,0.9
02-06-2021,35.8,7.8,24.2,12.7,11.4,4.6,1.3,0.8
02-08-2021,39.3,10.5,22.6,13.1,10.2,,,1.2
02-13-2021,35.2,9.6,21.1,15.6,10.6,3.8,1.7,1
02-13-2021,34.1,7.5,27.3,14,10.1,2.4,2.7,1.4
02-15-2021,35.2,9,26.5,12.9,9.9,2.5,1,1.5
02-23-2021,36.8,11.6,22.3,10.8,8.2,2.1,2.6,1.3
02-23-2021,41.7,11.6,21.8,10.3,9.3,1.8,0.4,0.8
02-26-2021,36.1,10,24.1,14,10.1,1.6,1.4,1.3
03-03-2021,34.8,7,26.2,13.1,9.1,2.5,2.7,0.7
03-06-2021,36.5,8.1,24,13.1,11,4,,
03-08-2021,36,9.9,26.3,13.5,8.7,2.1,1.9,1.1
03-10-2021,36,,29,,10.4,,,
03-11-2021,35.4,8.3,22,16.5,11.3,2.9,1.2,1.2
03-14-2021,34.4,7.2,26.2,13.9,9.6,2.7,2.8,0.8
and here's my code:
library(ggplot2)
library(anytime)
poll = read.csv("poll.csv")
poll$Date = as.Date(anydate(poll$Date))
ggplot(poll) +
geom_point(aes(x = Date, y = AKP), color = '#ff8700') +
geom_smooth(aes(x = Date, y = AKP), color = '#ff8700') +
geom_point(aes(x = Date, y = CHP), color = '#ff0000') +
geom_smooth(aes(x = Date, y = CHP), color = '#ff0000') +
geom_point(aes(x = Date, y = MHP), color = '#ff0019') +
geom_smooth(aes(x = Date, y = MHP), color = '#ff0019', se = FALSE) +
geom_point(aes(x = Date, y = HDP), color = '#8000ff') +
geom_smooth(aes(x = Date, y = HDP), color = '#8000ff', se = FALSE) +
geom_point(aes(x = Date, y = IYI), color = '#3db5e6') +
geom_smooth(aes(x = Date, y = IYI), color = '#3db5e6', se = FALSE) +
geom_point(aes(x = Date, y = SP), color = '#f50002') +
geom_smooth(aes(x = Date, y = SP), color = '#f50002', se = FALSE) +
geom_point(aes(x = Date, y = DEVA), color = '#0061a1') +
geom_smooth(aes(x = Date, y = DEVA), color = '#0061a1', se = FALSE) +
geom_point(aes(x = Date, y = GP), color = '#00564a') +
geom_smooth(aes(x = Date, y = GP), color = '#00564a', se = FALSE) +
scale_x_date(date_breaks = "3 months", date_labels = "%m/%y") +
labs(x = "", y = "") +
theme(axis.text.x = element_text(size = 20),
axis.text.y = element_text(size = 20))
I searched for answers but couldn't make any one of them work. How can I add a legend to the plot?
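ggplot2 only builds a legend for aesthetics mapped inside aes(); colors set as constants outside aes(), as in the code above, produce none. One possible approach, sketched here under the assumption that reshaping to long format is acceptable: the names `poll_long`, `party`, `share` and `party_colors` are introduced for illustration, only three polls and three parties are copied from poll.csv, and geom_line stands in for geom_smooth because three points are too few for a smoother.

```r
library(ggplot2)
library(tidyr)

# Three polls and three parties copied from poll.csv above
poll <- data.frame(
  Date = as.Date(c("2021-01-05", "2021-01-06", "2021-01-14")),
  AKP  = c(34.8, 36.2, 35),
  CHP  = c(27.2, 20.3, 25),
  MHP  = c(9, 9.1, 9.5)
)

# One row per (Date, party) so that party can be mapped to color
poll_long <- pivot_longer(poll, cols = -Date, names_to = "party", values_to = "share")

# Colors taken from the question's code
party_colors <- c(AKP = "#ff8700", CHP = "#ff0000", MHP = "#ff0019")

p <- ggplot(poll_long, aes(x = Date, y = share, color = party)) +
  geom_point() +
  geom_line() +
  scale_color_manual(values = party_colors, name = NULL)
p
```

With the full data set, geom_smooth(se = FALSE) can replace geom_line() and the remaining parties are added to `party_colors` in the same way.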

barplot with lineplot - secondary axis

After referring to multiple links I have arrived at the code below, but I still cannot get the line with labels to show. I suspect a mistake in the sec.axis transformation but I can't figure it out.
# dummy data
df_dummy = data.frame('Plan_code'=c('A','B','C','D','E','F','G'),
'Total'=c(191432,180241,99164,58443,56616,29579,19510),'STP'=c(41,40,44,37,37,37,45))
# creation of plot
g <- ggplot(data = df_dummy, aes(x = Plan_code, y = Total)) +
geom_col(aes(fill = 'Total')) +
geom_line(data = df_dummy, aes(x = Plan_code, y = STP,group=1)) +
geom_point(data = df_dummy, aes(x = Plan_code,y=STP)) +
geom_label(data = df_dummy, aes(x = Plan_code, y = STP, fill = Plan_code, label = paste0('%', STP)), color = 'white', vjust = 1.6, size = 3) +
scale_y_continuous(sec.axis = sec_axis(~. / 2000, name = 'PERCENT')) +
labs(fill = NULL, color = NULL) +
theme_minimal()
print(g)
Like that?
g <- ggplot(data = df_dummy, aes(x = Plan_code, y = Total)) +
geom_col(aes(fill = 'Total')) +
geom_point(data = df_dummy, aes(x = Plan_code,y=STP * 2000)) +
geom_label(data = df_dummy, aes(x = Plan_code, y = STP *2000, fill = Plan_code, label = paste0('%', STP)), color = 'white', vjust = 1.6, size = 3) +
scale_y_continuous(sec.axis = sec_axis(~. / 2000, name = 'PERCENT'))+
geom_line(data = df_dummy, aes(x = Plan_code, y = STP * 2000,group=1), col = 'blue') +
labs(fill = NULL, color = NULL) +
theme_minimal() + # apply the complete theme first so the tweaks below are not overwritten
theme(axis.text.y.right = element_text(color = 'blue'),axis.title.y.right = element_text(color = 'blue'))
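I just multiplied your STP values by 2000, so that the line, points and labels sit at the right height on the primary axis; the secondary axis divides by 2000 again to show percent.
I also changed the color of the line and of the secondary axis to blue.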
