How to annotate a Gadfly heatmap? - julia

Coming from Python I'm trying to reproduce this Seaborn plot in Julia using the Gadfly package. I've two questions:
How to annotate this heatmap with the actual values per cell without "duplicating" lines of code?
And how to modify the xticks to show all the year values from 1949 to 1960?
My code so far:
using DataFrames
using CSV
using Gadfly
using Compose
using ColorSchemes
download("https://raw.githubusercontent.com/mwaskom/seaborn-data/master/flights.csv", "flights.csv");
flights = DataFrame(CSV.File("flights.csv"))
flights_unstacked = unstack(flights, :month, :year, :passengers)
set_default_plot_size(16cm, 12cm)
plot(
flights,
x=:year,
y=:month,
color=:passengers,
Geom.rectbin,
Scale.ContinuousColorScale(palette -> get(ColorSchemes.magma, palette)),
Guide.xticks(ticks=[minimum(flights.year):maximum(flights.year);]),
Theme(background_color = "white"),
Guide.annotation(compose(context(), text(fill(1949, 12), 1:12, string.(flights_unstacked[:, "1949"]), [hcenter], [vcenter]), fontsize(7pt), stroke("white"))),
Guide.annotation(compose(context(), text(fill(1950, 12), 1:12, string.(flights_unstacked[:, "1950"]), [hcenter], [vcenter]), fontsize(7pt), stroke("white"))),
Guide.annotation(compose(context(), text(fill(1951, 12), 1:12, string.(flights_unstacked[:, "1951"]), [hcenter], [vcenter]), fontsize(7pt), stroke("white"))),
Guide.annotation(compose(context(), text(fill(1952, 12), 1:12, string.(flights_unstacked[:, "1952"]), [hcenter], [vcenter]), fontsize(7pt), stroke("white"))),
Guide.annotation(compose(context(), text(fill(1953, 12), 1:12, string.(flights_unstacked[:, "1953"]), [hcenter], [vcenter]), fontsize(7pt), stroke("white"))),
Guide.annotation(compose(context(), text(fill(1954, 12), 1:12, string.(flights_unstacked[:, "1954"]), [hcenter], [vcenter]), fontsize(7pt), stroke("white"))),
Guide.annotation(compose(context(), text(fill(1955, 12), 1:12, string.(flights_unstacked[:, "1955"]), [hcenter], [vcenter]), fontsize(7pt), stroke("white"))),
Guide.annotation(compose(context(), text(fill(1956, 12), 1:12, string.(flights_unstacked[:, "1956"]), [hcenter], [vcenter]), fontsize(7pt), stroke("white"))),
Guide.annotation(compose(context(), text(fill(1957, 12), 1:12, string.(flights_unstacked[:, "1957"]), [hcenter], [vcenter]), fontsize(7pt), stroke("white"))),
Guide.annotation(compose(context(), text(fill(1958, 12), 1:12, string.(flights_unstacked[:, "1958"]), [hcenter], [vcenter]), fontsize(7pt), stroke("white"))),
Guide.annotation(compose(context(), text(fill(1959, 12), 1:12, string.(flights_unstacked[:, "1959"]), [hcenter], [vcenter]), fontsize(7pt), stroke("white"))),
Guide.annotation(compose(context(), text(fill(1960, 12), 1:12, string.(flights_unstacked[:, "1960"]), [hcenter], [vcenter]), fontsize(7pt), stroke("white"))),
)

The year xticks answer is to add
Guide.xticks(ticks=[minimum(flights.year):maximum(flights.year);]),
to the plot statement.
You then need a Guide.annotation() statement for the annotations. It needs some tuning to look the same as Seaborn's, but this does what you need:
Guide.annotation(
compose(
context(),
text(
flights.year,
12:-1:1,
string.(flights.passengers),
[hcenter for x in flights.passengers],
),
fontsize(2.5),
stroke("white"),
),
),

Related

Problem with plot in R: reducing space between labels and bars, but x axis ticks disappear

I have the following code with its corresponding plot:
ggplot(df2, aes(x=Fecha.inicio, xend=Fecha.final, y=Ministro.a, yend=Ministro.a, color=Presidente)) +
theme_minimal()+ #use ggplot theme with black gridlines and white background
geom_segment(size=3) + #increase line width of segments in the chart
theme(plot.title=element_text(size=10, face="bold", hjust=0.5))+
theme(legend.title = element_text(size=9))+
theme(legend.background = element_rect(fill = "white", linetype="solid", colour="darkblue", size=0.05))+
theme(legend.text=element_text(size = 9, colour = "black"))+
theme(legend.key.height= unit(0.1, 'cm'),
legend.key.width= unit(0.3, 'cm'),
legend.key.size= unit(0.2, 'cm'))+
theme(axis.title.y = element_blank(),
axis.text.y = element_text(size=8),
axis.text.x = element_text(size=8))+
theme(axis.ticks.length = unit(0, "mm"))+
theme(legend.position = c(0.8, 0.8))+
#scale_x_continuous(expand = c(.01, .01)) +
scale_fill_identity(guide = "none") +
scale_color_viridis(discrete=T,
breaks=c("Toledo", "García", "Humala", "De PPk a Sagasti","Castillo"))+
labs("title"="Perú. Duración de los ministros de Agricultura (2001-2022)", x="")+
labs(caption=" Elaboración: Ivan Ramírez (#peonpasado_)\n Fecha: 19/07/2022")+
theme(plot.caption= element_text(size=7,
color="black",
face="bold",
hjust = -0.28,
vjust = 1))
I wanted to reduce the space between the bars and the labels on the y axis, so I added the scale_x_continuous line that is commented out in the code above. This is the result.
As you can see, the x axis values have changed and moved one grid line to the left. I want them to be the same as in the first graph, since it shows how long the agriculture ministers have lasted in Peru over the last 20 years. I tried this in the scale_x_continuous line:
scale_x_continuous(expand = c(.01, .01), breaks=c(2000, 2005, 2010, 2015, 2020, 2025))
As a consequence, all the x axis values disappeared. They also disappeared when I tried this:
scale_x_continuous(expand = c(.01, .01), breaks=seq(2000,2025,5000))
Hope you can help me to solve the problem.
Additional info
I was asked to show some of the data I used for the plot. This is the output of
dput(head(df2, 20))
Note that in the plot I'm not using the variables "Sexo", "Fecha.nacimiento", "Días.Cargo", "year", "month" and "day".
structure(list(Fecha.inicio = structure(c(19149, 19134, 19031,
19024, 18837, 18584, 18578, 18172, 17966, 17623, 17540, 17010,
16125, 15544, 15319, 15183, 15107, 14866, 14436, 14166), class = "Date"),
Sector = c("Agricultura", "Agricultura", "Agricultura", "Agricultura",
"Agricultura", "Agricultura", "Agricultura", "Agricultura",
"Agricultura", "Agricultura", "Agricultura", "Agricultura",
"Agricultura", "Agricultura", "Agricultura", "Agricultura",
"Agricultura", "Agricultura", "Agricultura", "Agricultura"
), Ministro.a = structure(1:20, .Label = c("A. Alencastre",
"J. Arce", "O. Zea", "A. Ramos", "V. Mayta", "F. Tenorio",
"F. Hurtado", "J. L. Montenegro", "F. Muñoz", "G. Mostajo",
"J. Arista", "J. M. Hernández", "J. M. Benites", "M. von Hesse",
"L. Ginocchio", "M. Caillaux", "J. Villasante", "R. Quevedo",
"A. de Córdoba", "C. Leyton", "I. Benavides", "J. J. Salazar",
"M. Manrique", "Á. Quijandría/2", "J. León", "E. Gonzáles",
"Á. Quijandría/1"), class = "factor"), Fecha.nacimiento = c("",
"31/07/1970", "13/08/1973", "", "", "10/03/1957", "", "23/10/1965",
"27/11/1971", "18/08/1973", "18/08/1959", "27/08/1948", "21/06/1967",
"10/09/1964", "25/04/1954", "23/09/1950", "7/01/1962", "",
"27/09/1950", "8/01/1952"), Sexo = c("Hombre", "Hombre",
"Hombre", "Hombre", "Hombre", "Hombre", "Hombre", "Hombre",
"Mujer", "Hombre", "Hombre", "Hombre", "Hombre", "Hombre",
"Hombre", "Hombre", "Hombre", "Hombre", "Hombre", "Hombre"
), Presidente = c("Castillo", "Castillo", "Castillo", "Castillo",
"Castillo", "De PPk a Sagasti", "De PPk a Sagasti", "De PPk a Sagasti",
"De PPk a Sagasti", "De PPk a Sagasti", "De PPk a Sagasti",
"De PPk a Sagasti", "Humala", "Humala", "Humala", "Humala",
"García", "García", "García", "García"), year = c(2022, 2022,
2022, 2022, 2021, 2020, 2020, 2019, 2019, 2018, 2018, 2016,
2014, 2012, 2011, 2011, 2011, 2010, 2009, 2008), month = c(6,
5, 2, 2, 7, 11, 11, 10, 3, 4, 1, 7, 2, 7, 12, 7, 5, 9, 7,
10), day = c(6L, 22L, 8L, 1L, 29L, 18L, 12L, 3L, 11L, 2L,
9L, 28L, 24L, 23L, 11L, 28L, 13L, 14L, 11L, 14L), Fecha.final = structure(c(19194,
19149, 19134, 19031, 19024, 18837, 18584, 18578, 18172, 17966,
17623, 17540, 17010, 16125, 15544, 15319, 15183, 15107, 14866,
14436), class = "Date"), Días.Cargo = c(45, 15, 103, 7, 187,
253, 6, 406, 206, 343, 83, 530, 885, 581, 225, 136, 76, 241,
430, 270), numero = 1:20), row.names = c(3L, 6L, 15L, 24L,
49L, 72L, 91L, 135L, 149L, 173L, 192L, 221L, 253L, 272L, 280L,
297L, 308L, 318L, 331L, 344L), class = "data.frame")
Here is a way.
Since the x axis represents an object of class "Date", use an appropriate date scale to customize it. With scale_x_date, date breaks and labels can be set using an intuitive syntax.
I have also created a custom theme, simplifying the code of the plot itself.
library(ggplot2)
library(viridis)
#> Loading required package: viridisLite
theme_ministros_de_Agricultura <- function(){
theme_minimal() %+replace% #
theme(
plot.title = element_text(size = 10, face="bold", hjust = 0.5),
plot.caption= element_text(size = 7, color="black", face = "bold",
hjust = -0.28, vjust = 1),
legend.title = element_text(size = 9),
legend.position = c(0.8, 0.8),
legend.background = element_rect(colour = "darkblue", fill = "white",
linetype = "solid", size = 0.05),
legend.text = element_text(size = 9, colour = "black"),
legend.key.height = unit(0.1, 'cm'),
legend.key.width = unit(0.3, 'cm'),
legend.key.size = unit(0.2, 'cm'),
axis.title.y = element_blank(),
axis.text.y = element_text(size = 8),
axis.text.x = element_text(size = 8),
axis.ticks.length = unit(0, "mm")
)
}
ggplot(df2, aes(x = Fecha.inicio, xend = Fecha.final, y = Ministro.a, yend = Ministro.a, color = Presidente)) +
geom_segment(size=3) + # increase line width of segments in the chart
# x axis is a date class
scale_x_date(
expand = c(.01, .01),
date_breaks = "2 years",
date_labels = "%Y"
) +
scale_fill_identity(guide = "none") +
scale_color_viridis(
discrete = TRUE,
breaks = c("Toledo", "García", "Humala", "De PPk a Sagasti","Castillo")
)+
labs(
x="",
title = "Perú. Duración de los ministros de Agricultura (2001-2022)",
caption = " Elaboración: Ivan Ramírez (#peonpasado_)\n Fecha: 19/07/2022"
) +
theme_ministros_de_Agricultura()
Created on 2022-07-21 by the reprex package (v2.0.1)
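A note on why the numeric breaks in the question vanished, offered as a likely explanation rather than a confirmed diagnosis: R stores a Date as the number of days since 1970-01-01, so breaks such as 2000 or 2025 are read as day counts that fall in 1975, far outside the plotted range. The sketch below uses only base R; the names days_2020, day_2000 and year_breaks are made up for illustration.

```r
# A Date is stored as the number of days since 1970-01-01
days_2020 <- as.numeric(as.Date("2020-01-01"))
days_2020  # 18262

# A numeric break of 2000 therefore means "day 2000", which falls in 1975
day_2000 <- as.Date(2000, origin = "1970-01-01")
format(day_2000, "%Y")

# Date-valued breaks equivalent to what the question attempted (hypothetical name)
year_breaks <- seq(as.Date("2000-01-01"), as.Date("2025-01-01"), by = "5 years")
format(year_breaks, "%Y")
```

A vector like year_breaks could be passed as scale_x_date(breaks = year_breaks, date_labels = "%Y") if fixed five-year ticks are preferred over date_breaks.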

Trying to replicate a visualisation in R

Relatively inexperienced R user. I am trying to create something similar to the visualisation below with data for another country.
I've gone as far as creating the basic structure with data plotted in a vertical annual timeline with months running along the x axis but I have no idea how to edit the individual data points. I would appreciate any idea on how to move forward or even a completely different approach.
Here is my code using ggplot2:
p <- ggplot(forestfiresv, aes(y=year, x=dtstart))
p+geom_point() +
scale_x_datetime(lim=as.POSIXct(c("2021-01-01 00:01","2021-12-31 00:00", origin=lubridate::origin), "%m/%d %H:%M",tz="UTC"),expand = c(0,0), date_breaks="2 months", labels = date_format("%b"))+
theme_bw()
A data sample:
structure(list(year = c("2000", "2000", "2000", "2000", "2000",
"2000", "2000", "2000", "2000", "2000"), `Start date` = structure(c(11174, 11167, 11166, 11191,
11222, 11144, 11151, 11192, 11244, 11187), class = "Date"), `Start time` = c("02:15",
"16:05", "10:47", "15:41", "23:30", "15:29", "14:00", "13:53",
"17:39", "11:09"), `End date` = structure(c(11174,
11178, 11166, 11192, 11223, 11146, 11152, 11197, 11244, 11191
), class = "Date"), `End time` = c("14:00", "07:00", "19:00",
"22:00", "02:00", "12:00", "00:10", "13:30", "19:07", "11:30"
), Δάση = c(200, 1400, 400, 0, 0, 0, 600, 2000, 0, 260), `Forest` = c(800,
0, 0, 100, 100, 700, 0, 0, 0, 0), `Agricultural land` = c(0, 0, 0, 200, 0, 0, 200, 500, 0, 0), totalareaburnt = c(1000, 1400, 400, 500, 500, 700, 800, 2500, 350, 360), dtstart = structure(c(1628129700, 1627574700, 1627469220, 1629646860, 1632353400, 1625585340, 1626184800, 1629726780, 1634233140, 1629284940), class = c("POSIXct", "POSIXt"), tzone = "UTC"), dtend = structure(c(1628172000, 1628492400, 1627498800, 1629756000, 1632362400, 1625745600, 1626221400, 1630157400, 1634238420, 1629631800), class = c("POSIXct", "POSIXt"), tzone = "UTC")), .internal.selfref = <pointer: (nil)>, row.names = c(NA, 10L), class = c("data.table", "data.frame"))
This is the best I've obtained so far, but I bet it could be better. I've increased your example data frame because there was only one year of observation and I've injected some randomness to make the plot look better.
library(ggplot2)
# df: the sample data from the question
ddf <- rbind(df,df,df,df,df,df,df,df,df,df)
ddf$year <- rep(2000:2009,each=10)
ddf$totalareaburnt <- sample(200:2500,100,replace = TRUE)
ddf$dtstart <- ddf$dtstart+sample(86400*1:90,100,replace = TRUE)
# duration in days, with the units requested explicitly
ddf$duration <- as.numeric(difftime(df$dtend, df$dtstart, units = "days"))
ddf$year <- as.integer(ddf$year)
ggplot(ddf,
aes(y = year,
x = dtstart)) +
geom_point(aes(size = totalareaburnt,
col = duration),
shape = 17,
alpha = 0.7) +
scale_x_datetime(
lim = as.POSIXct(
c("2021-01-01 00:01", "2021-12-31 00:00", origin = lubridate::origin),
"%m/%d %H:%M",
tz = "UTC"
),
expand = c(0, 0),
date_breaks = "1 months",
labels = scales::date_format("%b")
) +
theme_minimal() +
theme(
legend.position = "top",
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
axis.line = element_line(),
axis.ticks = element_line()
) +
scale_y_continuous(trans = "reverse", breaks = unique(ddf$year))+
scale_colour_gradientn(name= "Duration (days)",colours = c( "yellow", "orange","darkred"))+
scale_size_continuous(name="Area burned (ha)")
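When computing a duration from two POSIXct columns, as in the code above, note that plain subtraction returns a difftime whose units R picks automatically from the data, so the same expression can yield hours for one data set and days for another. A minimal sketch in base R, using two start/end pairs that correspond to the first two rows of the question's sample; the variable names are illustrative:

```r
# Two start/end pairs matching the first two rows of the question's sample
start <- as.POSIXct(c("2021-08-05 02:15", "2021-07-29 16:05"), tz = "UTC")
end   <- as.POSIXct(c("2021-08-05 14:00", "2021-08-09 07:00"), tz = "UTC")

# Plain subtraction: the units are chosen automatically
auto_diff <- end - start
units(auto_diff)

# Asking for days explicitly makes the result independent of the data
duration_days <- as.numeric(difftime(end, start, units = "days"))
duration_days
```

The explicit units argument is the safer choice when the result feeds a legend labelled in days.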

Adding p-values to ggplot; ggsignif says it can only handle data with groups that are plotted on the x-axis

I have data as follows, to which I am trying to add p-values:
library(ggplot2)
library(ggsignif)
library(dplyr)
data <- structure(list(treatment = c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1), New_Compare_Truth = c(57,
61, 12, 14, 141, 87, 104, 90, 12, 14), total_Hy = c(135,
168, 9, 15, 103, 83, 238, 251, 9, 15), total = c(285, 305, 60,
70, 705, 435, 520, 450, 60, 70), ratio = c(47.3684210526316,
55.0819672131148, 15, 21.4285714285714, 14.6099290780142, 19.0804597701149,
45.7692307692308, 55.7777777777778, 15, 21.4285714285714), Type = structure(c(2L,
2L, 1L, 1L, 3L, 3L, 5L, 5L, 4L, 4L), .Label = c("A1. Others \nMore \nH",
"A2. Similar \nNorm", "A3. Others \nLess \nH", "B1. Others \nMore \nH",
"B2. Similar \nNorm or \nHigher"), class = "factor"), `Sample Selection` = c("Answers pr",
"Answers pu", "Answers pr", "Answers pu", "Answers pr",
"Answers pu", "Answers pr", "Answers pu", "Answers pr",
"Answers pu"), p_value = c(0.0610371842601616, 0.0610371842601616,
0.346302201593934, 0.346302201593934, 0.0472159407450147, 0.0472159407450147,
0.0018764377521242, 0.0018764377521242, 0.346302201593934, 0.346302201593934
), x = c(2, 2, 1, 1, 3, 3, 5.5, 5.5, 4.5, 4.5)), row.names = c(NA,
-10L), class = c("data.table", "data.frame"))
breaks_labels <- structure(list(Type = structure(c(2L, 1L, 3L, 5L, 4L), .Label = c("A1. Others \nMore \nH",
"A2. Similar \nNorm", "A3. Others \nLess \nH", "B1. Others \nMore \nH",
"B2. Similar \nNorm or \nHigher"), class = "factor"), x = c(2,
1, 3, 5.5, 4.5)), row.names = c(NA, -5L), class = c("data.table",
"data.frame"))
data %>%
ggplot(aes(x = x, y = ratio)) +
geom_col(aes(fill = `Sample Selection`), position = position_dodge(preserve = "single"), na.rm = TRUE) +
geom_text(position = position_dodge(width = .9), # move to center of bars
aes(label=sprintf("%.02f %%", round(ratio, digits = 1)), group = `Sample Selection`),
vjust = -1.5, # nudge above top of bar
size = 4,
na.rm = TRUE) +
# geom_text(position = position_dodge(width = .9), # move to center of bars
# aes(label= paste0("(", ifelse(variable == "Crime = 0", `Observation for Crime = 0`, `Observation for Crime = 1`), ")"), group = `Sample Selection`),
# vjust = -0.6, # nudge above top of bar
# size = 4,
# na.rm = TRUE) +
scale_fill_grey(start = 0.8, end = 0.5) +
scale_y_continuous(expand = expansion(mult = c(0, .1))) +
scale_x_continuous(breaks = breaks_labels$x, labels = breaks_labels$Type) +
theme_bw(base_size = 15) +
xlab("Norm group for corporate Hy") +
ylab("Percentage Compliant Decisions") +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
geom_signif(annotation=c("p=0.35", "p=0.06", "p=0.05", "p=0.34", "p=0.00"), y_position = c(30, 40, 55 ,75, 90), xmin=c(0.75,1.75,2.75,3.75,4.75),
xmax=c(1.25,2.25,3.25,4.25,5.25))
For some reason, the last line causes the following error:
Error in f(...) :
Can only handle data with groups that are plotted on the x-axis
Since I am just putting in text and not referring to any variable, I don't really understand why this happens. Can anyone help me out? Without the last line it looks like this:
EDIT: Please note that I would like to keep the space between the third and the fourth column (which is apparently also what caused the problem, see Jared's answer).
Edit
Thanks for clarifying your expected outcome. Here is one way to include geom_signif() annotations without altering the original plot:
library(tidyverse)
library(ggsignif)
data <- structure(list(treatment = c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1), New_Compare_Truth = c(57,
61, 12, 14, 141, 87, 104, 90, 12, 14), total_Hy = c(135,
168, 9, 15, 103, 83, 238, 251, 9, 15), total = c(285, 305, 60,
70, 705, 435, 520, 450, 60, 70), ratio = c(47.3684210526316,
55.0819672131148, 15, 21.4285714285714, 14.6099290780142, 19.0804597701149,
45.7692307692308, 55.7777777777778, 15, 21.4285714285714), Type = structure(c(2L,
2L, 1L, 1L, 3L, 3L, 5L, 5L, 4L, 4L), .Label = c("A1. Others \nMore \nH",
"A2. Similar \nNorm", "A3. Others \nLess \nH", "B1. Others \nMore \nH",
"B2. Similar \nNorm or \nHigher"), class = "factor"), `Sample Selection` = c("Answers pr",
"Answers pu", "Answers pr", "Answers pu", "Answers pr",
"Answers pu", "Answers pr", "Answers pu", "Answers pr",
"Answers pu"), p_value = c(0.0610371842601616, 0.0610371842601616,
0.346302201593934, 0.346302201593934, 0.0472159407450147, 0.0472159407450147,
0.0018764377521242, 0.0018764377521242, 0.346302201593934, 0.346302201593934
), x = c(2, 2, 1, 1, 3, 3, 5.5, 5.5, 4.5, 4.5)), row.names = c(NA,
-10L), class = c("data.table", "data.frame"))
breaks_labels <- structure(list(Type = structure(c(2L, 1L, 3L, 5L, 4L), .Label = c("A1. Others \nMore \nH",
"A2. Similar \nNorm", "A3. Others \nLess \nH", "B1. Others \nMore \nH",
"B2. Similar \nNorm or \nHigher"), class = "factor"), x = c(2,
1, 3, 5.5, 4.5)), row.names = c(NA, -5L), class = c("data.table",
"data.frame"))
annotation_df <- data.frame(signif = c("p=0.35", "p=0.06", "p=0.05", "p=0.34", "p=0.00"),
y_position = c(30, 40, 55 ,75, 90),
xmin = c(0.75,1.75,2.75,4.25,5.25),
xmax = c(1.25,2.25,3.25,4.75,5.75),
group = c(1,2,3,4,5))
data %>%
ggplot(aes(x = x, y = ratio, group = `Sample Selection`)) +
geom_col(aes(fill = `Sample Selection`),
position = position_dodge(preserve = "single"), na.rm = TRUE) +
geom_text(position = position_dodge(width = .9), # move to center of bars
aes(label=sprintf("%.02f %%", round(ratio, digits = 1))),
vjust = -1.5, # nudge above top of bar
size = 4,
na.rm = TRUE) +
scale_fill_grey(start = 0.8, end = 0.5) +
scale_y_continuous(expand = expansion(mult = c(0, .1))) +
scale_x_continuous(breaks = breaks_labels$x, labels = breaks_labels$Type) +
theme_bw(base_size = 15) +
xlab("Norm group for corporate Hy") +
ylab("Percentage Compliant Decisions") +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
geom_signif(aes(xmin = xmin,
xmax = xmax,
y_position = y_position,
annotations = signif,
group = group),
data = annotation_df, manual = TRUE)
#> Warning: Ignoring unknown aesthetics: xmin, xmax, y_position, annotations
Created on 2021-07-20 by the reprex package (v2.0.0)
Previous answer
One potential solution to your problem is to plot "Type" on the x axis instead of "x", e.g.
data %>%
ggplot(aes(x = Type, y = ratio)) +
geom_col(aes(fill = `Sample Selection`),
position = position_dodge(preserve = "single"), na.rm = TRUE) +
geom_text(position = position_dodge(width = .9), # move to center of bars
aes(label=sprintf("%.02f %%", round(ratio, digits = 1)),
group = `Sample Selection`),
vjust = -1.5,
size = 4,
na.rm = TRUE) +
scale_fill_grey(start = 0.8, end = 0.5) +
scale_y_continuous(expand = expansion(mult = c(0, .1))) +
theme_bw(base_size = 15) +
xlab("Norm group for corporate Hy") +
ylab("Percentage Compliant Decisions") +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
geom_signif(annotation=c("p=0.35", "p=0.06", "p=0.05", "p=0.34", "p=0.00"),
y_position = c(30, 40, 55 ,75, 90),
xmin=c(0.75,1.75,2.75,3.75,4.75),
xmax=c(1.25,2.25,3.25,4.25,5.25))
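The annotation labels and bracket positions can also be derived from the p_value and x columns instead of being typed by hand. This is only a sketch in base R: pvals is a hypothetical helper table holding one row per x position, with values copied from the question's data, and the half-width of 0.25 and the y positions are taken from the answer above.

```r
# One row per bar pair: x position and p-value, copied from the question's data
pvals <- data.frame(
  x = c(2, 1, 3, 5.5, 4.5),
  p_value = c(0.0610371842601616, 0.346302201593934, 0.0472159407450147,
              0.0018764377521242, 0.346302201593934)
)
pvals <- pvals[order(pvals$x), ]  # left-to-right order on the x axis

annotation_df <- data.frame(
  signif = sprintf("p=%.2f", pvals$p_value),  # label text
  xmin = pvals$x - 0.25,                      # bracket start
  xmax = pvals$x + 0.25,                      # bracket end
  y_position = c(30, 40, 55, 75, 90),         # heights chosen by hand, as in the answer
  group = seq_len(nrow(pvals))
)
annotation_df
```

The resulting data frame has the same columns as the hand-written annotation_df in the edited answer, so it can be passed to geom_signif(data = annotation_df, manual = TRUE) in the same way. Note that formatting 0.346 this way gives "p=0.35" for both pairs with that p-value.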

How can I combine color and shape identity with ggplot2?

I get 2 different legends one for shape and one for color. I've read: Combine legends for color and shape into a single legend already but I have no clue how to combine a color/shape identity together.
My Data:
It is called Vergleich2
My code:
Vergleich2 <- data.frame(
list(
RH = c(4.4, 70.81, 89.74, 98.21, 99.45, 100.3, 101.16, 101.83, 103.46, 103.65, 103.9, 33.37, 32.26, 50.39, 75.65, 81.54, 86.58, 91.88, 94.1, 96.41, 98.52, 99.93, 101.45, 77.09, 84.51, 92.15, 94.61, 96.22, 97.36, 98.85, 98.95, 98.74, 99.34, 100.07, 101.06, 102.45, 103.04),
max = c(0, 0.0262005491707849,
0.0960002914076637, 0.26123554979527, 0.299421329851762, 0.362190635901956, 0.452267730725373, 0.60803295055093, 0.958096790371026, 0.995440372287362, 1, 0.0191985206504361, 0, 0.0444427600652313, 0.0676200802520531, 0.0922989569990268, 0.112052964622176, 0.182215180712429, 0.248659241123121, 0.327097853193048, 0.496708233033155, 0.627302705113058, 1, 0.515522981617377, 0.585402158506993, 0.762678213109326, 0.920738889764711, 0.836214001953324, 0.871654063266438, 0.908503395902539, 0.825786584233689, 0.875522664077668, 0.831158954459146, 0.831533205795933, 0.893700430247523, 1.00803637031109,
1), letzte = c(0, 0.0171373807524096, 0.0818334694345387, 0.239981280241844, 0.280068579568638, 0.345939316999413, 0.434432925347285, 0.611502937955804, 0.964279264750348, 1.00834862405373, 1, 0.00678220086610785, 0, -0.00307024455552525, 0.0255053593718935, 0.0748980985479396, 0.0890155980480638, 0.153017148428967, 0.187262260262659, 0.306449913424004,
0.454599256084893, 0.614943073105356, 1, 0.527873986434174, 0.593334258062775, 0.768834444991388, 1.21440714508987, 0.847592976104216, 0.892496700707447, 0.917439391188656, 0.834935302471757, 0.840806889095709, 0.823590477107656,
0.834511976098586, 0.912778381850167, 1.00642363306524, 1)))
Region1_plot <- ggplot()+
geom_point(data=Vergleich2[c(1:11),], mapping=aes(x=RH,y=max,col="red",shape=19))+
geom_point(data=Vergleich2[c(1:11),], mapping=aes(x=RH,y=letzte, col="blue",shape=19))+
geom_point(data=Vergleich2[c(12:13),], mapping=aes(x=RH,y=max,col="red",shape=3))+
geom_point(data=Vergleich2[c(12:13),], mapping=aes(x=RH,y=letzte, col="blue",shape=3))+
geom_point(data=Vergleich2[c(14:23),], mapping=aes(x=RH,y=max,col="red",shape=6))+
geom_point(data=Vergleich2[c(14:23),], mapping=aes(x=RH,y=letzte, col="blue",shape=6))+
scale_color_identity("", guide="legend", breaks=c("red","blue"),labels=c("Incr.1","Decr.1"))+
scale_shape_identity("", guide = "legend", breaks=c(19,3,6), labels=c("Incr.1","Decr.1", "Incr.2"))+
labs(title = "Relative Wasseraufnahme Isopren SOA #10 (RH)", title.position="center")+
ylab("Norm. Optical Density [-]")+
xlab("RH [%]")+
coord_cartesian(xlim = c(0, 100))+
scale_x_continuous(breaks=c(seq(0,120, 4)))+
theme(axis.text = element_text(size = 15),
axis.title = element_text(size=15),
plot.title = element_text(hjust = 0.5),
legend.title = element_text(size=0),
legend.text= element_text(size=15),
legend.background = element_rect(),
legend.position = c(0.095,0.9),
title = element_text(size=20))
print(Region1_plot)
Thx for your help!!
In my experience it makes sense to shape the data so that you can keep the ggplot calls as simple as possible. The many geom_point calls hint at a problem with your input data. Here's a proposal for adding a column that contains a combination of the attributes you want to show:
library(tidyverse)
Vergleich2 <- data.frame(
list(
RH = c(4.4, 70.81, 89.74, 98.21, 99.45, 100.3, 101.16, 101.83, 103.46, 103.65, 103.9, 33.37, 32.26, 50.39, 75.65, 81.54, 86.58, 91.88, 94.1, 96.41, 98.52, 99.93, 101.45, 77.09, 84.51, 92.15, 94.61, 96.22, 97.36, 98.85, 98.95, 98.74, 99.34, 100.07, 101.06, 102.45, 103.04),
max = c(0, 0.0262005491707849,
0.0960002914076637, 0.26123554979527, 0.299421329851762, 0.362190635901956, 0.452267730725373, 0.60803295055093, 0.958096790371026, 0.995440372287362, 1, 0.0191985206504361, 0, 0.0444427600652313, 0.0676200802520531, 0.0922989569990268, 0.112052964622176, 0.182215180712429, 0.248659241123121, 0.327097853193048, 0.496708233033155, 0.627302705113058, 1, 0.515522981617377, 0.585402158506993, 0.762678213109326, 0.920738889764711, 0.836214001953324, 0.871654063266438, 0.908503395902539, 0.825786584233689, 0.875522664077668, 0.831158954459146, 0.831533205795933, 0.893700430247523, 1.00803637031109,
1), letzte = c(0, 0.0171373807524096, 0.0818334694345387, 0.239981280241844, 0.280068579568638, 0.345939316999413, 0.434432925347285, 0.611502937955804, 0.964279264750348, 1.00834862405373, 1, 0.00678220086610785, 0, -0.00307024455552525, 0.0255053593718935, 0.0748980985479396, 0.0890155980480638, 0.153017148428967, 0.187262260262659, 0.306449913424004,
0.454599256084893, 0.614943073105356, 1, 0.527873986434174, 0.593334258062775, 0.768834444991388, 1.21440714508987, 0.847592976104216, 0.892496700707447, 0.917439391188656, 0.834935302471757, 0.840806889095709, 0.823590477107656,
0.834511976098586, 0.912778381850167, 1.00642363306524, 1)))
plot_df <- Vergleich2[1:23,] ## above you plot a subset of the data - that's why I'm choosing rows 1:23
plot_df <- plot_df %>%
mutate(shapes = c(rep("Incr.1", 11), rep("Decr.1", 2), rep("Incr.2", 10))) %>% ## adding the attribute for shapes
pivot_longer(cols = c("max", "letzte"), names_to = "colrs") %>% ## tidying data (a format that is ggplot-friendly)
mutate(combined = paste(shapes, colrs)) ## and combining the columns so that I can use one column for both shape and colour
ggplot(plot_df, aes(x = RH, y = value, shape = combined, colour = combined))+
geom_point() +
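# the levels of combined sort alphabetically: "Decr.1 letzte", "Decr.1 max", "Incr.1 letzte", "Incr.1 max", "Incr.2 letzte", "Incr.2 max"
scale_color_manual("", values = c("blue", "red", "blue", "red", "blue", "red"))+ # letzte = blue, max = red, as in the question
scale_shape_manual("", values = c(3, 3, 19, 19, 6, 6))+ # Decr.1 = 3, Incr.1 = 19, Incr.2 = 6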
labs(title = "Relative Wasseraufnahme Isopren SOA #10 (RH)", title.position="center")+
ylab("Norm. Optical Density [-]")+
xlab("RH [%]")+
coord_cartesian(xlim = c(0, 100))+
scale_x_continuous(breaks=c(seq(0,120, 4)))+
theme(axis.text = element_text(size = 15),
axis.title = element_text(size=15),
plot.title = element_text(hjust = 0.5),
legend.title = element_text(size=0),
legend.text= element_text(size=15),
legend.background = element_rect(),
legend.position = c(0.095,0.9),
title = element_text(size=20))
See Hadley Wickham's book on "tidy data".
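Because positional values in scale_color_manual and scale_shape_manual depend on the sort order of the combined labels, named vectors are less error-prone: ggplot2 matches named values to levels by name. The following base R sketch builds such vectors; shapes, colours, combos, shape_values and colour_values are illustrative names, and the shape and colour assignments are the ones used in the question.

```r
# Shape per measurement series and colour per variable, as in the question
shapes  <- c("Incr.1" = 19, "Decr.1" = 3, "Incr.2" = 6)
colours <- c(max = "red", letzte = "blue")

# Every shape/colour combination, labelled the same way as the combined column
combos <- expand.grid(shape = names(shapes), colr = names(colours),
                      stringsAsFactors = FALSE)
key <- paste(combos$shape, combos$colr)

# Named vectors: the name is the combined label, the value is the shape or colour
shape_values  <- setNames(unname(shapes[combos$shape]), key)
colour_values <- setNames(unname(colours[combos$colr]), key)

shape_values
colour_values
```

These could then be supplied as scale_shape_manual("", values = shape_values) and scale_color_manual("", values = colour_values), so the mapping no longer depends on the order of the levels.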

How to add abline, without so much programming?

Do you know if there is a way to reduce the number of lines of code?
abline(v = c(1990,1991,1992,1993,1994, 1995,1996,
1997,1998, 1999, 2000, 2001, 2002, 2003,
2004,2005,2006,2007,2008,2009,2010,
2011,2012,2013,2014,2015,2016,2017),
col = c("red","red","red","red","red","red","red",
"red","red","red","red","red","red","red",
"red","red","red","red","red","red","red",
"red","red","red","red","red","red","red"),
lty = c(2,2,2,2,2,2,2,
2,2,2,2,2,2,2,
2,2,2,2,2,2,2,
2,2,2,2,2,2,2),
lwd = c(1,1,1,1,1,1,1,
1,1,1,1,1,1,1,
1,1,1,1,1,1,1,
1,1,1,1,1,1,1),
h = c(200,400,600,800,1000))
Like this?
abline(v = seq(1990, 2017, 1),
col = rep("red", 28),
lty = rep(2, 28),
lwd = rep(1, 28),
h = seq(200, 1000, 200))
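The call can be shortened further, since graphical parameters such as col, lty and lwd are recycled across all lines, so a single value each is enough. A minimal sketch with an empty plot to draw on; the axis limits and the names years and levels_h are assumptions for illustration:

```r
years    <- 1990:2017                    # vertical line positions
levels_h <- seq(200, 1000, by = 200)     # horizontal line positions

# Empty plot with limits that cover both sets of lines (limits are illustrative)
plot(NA, xlim = range(years), ylim = c(0, 1200), xlab = "", ylab = "")

# Single values for col, lty and lwd are recycled for every line
abline(v = years, h = levels_h, col = "red", lty = 2, lwd = 1)
```

With a real plot already on screen, only the abline() line is needed.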
