I would like to display months (in abbreviated form) along the horizontal axis, with the corresponding year printed once. I know how to display month-year:
The un-needed repetition of the year clutters the labels. Instead I would like something like this:
except that the year would be printed below the months.
I printed the year above the axis labels, because that's the best I could do. This follows a limitation of the annotate() function, which gets clipped if it lies outside of the plot area. I am aware of possible workarounds based on annotate_custom(), but I couldn't make them to work with date objects (I did not try to convert dates to numbers and back to dates again, as it seemed more complicated than hopefully necessary)
I'm wondering if the new dup_axis() could be hijacked for this purpose. If instead of sending the duplicated axis to the opposite side of the panel, it could send it a few lines below the duplicated axis, then perhaps it would just be a matter of setting up one axis with panel.grid.major blanked out and the labels set to %b, while the other axis would have panel.grid.minor blanked out and the labels set to %Y. (an added challenge is that the year labels would be shifted to October instead of January)
These questions are related. However, the annotate_custom() function and textGrob() functions do not play well with dates, as far as I can tell.
how-can-i-add-annotations-below-the-x-axis-in-ggplot2
displaying-text-below-the-plot-generated-by-ggplot2
Data and basic code below:
library("ggplot2")
library("scales")
ggplot(data = df, aes(x = Date, y = value)) + geom_line() +
scale_x_date(date_breaks = "2 month", date_minor_breaks = "1 month", labels = date_format("%b %Y")) +
xlab(NULL)
ggplot(data = df, aes(x = Date, y = value)) + geom_line() +
scale_x_date(date_minor_breaks = "2 month", labels = date_format("%b")) +
annotate(geom = "text", x = as.Date("1719-10-01"), y = 0, label = "1719") +
annotate(geom = "text", x = as.Date("1720-10-01"), y = 0, label = "1720") +
xlab(NULL)
# data
df <- structure(list(Date = structure(c(-91455, -91454, -91453, -91452,
-91451, -91450, -91448, -91447, -91446, -91445, -91444, -91443,
-91441, -91440, -91439, -91438, -91437, -91436, -91434, -91433,
-91431, -91430, -91429, -91427, -91426, -91425, -91424, -91423,
-91422, -91420, -91419, -91418, -91417, -91416, -91415, -91413,
-91412, -91411, -91410, -91409, -91408, -91406, -91405, -91404,
-91403, -91402, -91401, -91399, -91398, -91397, -91396, -91395,
-91394, -91392, -91391, -91390, -91389, -91388, -91387, -91385,
-91384, -91382, -91381, -91380, -91379, -91377, -91376, -91375,
-91374, -91373, -91372, -91371, -91370, -91369, -91368, -91367,
-91366, -91364, -91363, -91362, -91361, -91360, -91359, -91357,
-91356, -91355, -91354, -91353, -91352, -91350, -91349, -91348,
-91347, -91346, -91345, -91343, -91342, -91341, -91340, -91339,
-91338, -91336, -91335, -91334, -91333, -91332, -91331, -91329,
-91328, -91327, -91326, -91325, -91324, -91322, -91321, -91320,
-91319, -91315, -91314, -91313, -91312, -91311, -91310, -91308,
-91307, -91306, -91305, -91304, -91303, -91301, -91300, -91299,
-91298, -91297, -91296, -91294, -91293, -91292, -91291, -91290,
-91289, -91287, -91286, -91285, -91284, -91283, -91282, -91280,
-91279, -91278, -91277, -91276, -91275, -91273, -91272, -91271,
-91270, -91269, -91268, -91266, -91265, -91264, -91263, -91262,
-91261, -91259, -91258, -91257, -91256, -91255, -91254, -91252,
-91251, -91250, -91249, -91248, -91247, -91245, -91244, -91243,
-91242, -91241, -91240, -91238, -91237, -91236, -91235, -91234,
-91233, -91231, -91230, -91229, -91228, -91227, -91226, -91224,
-91223, -91222, -91221, -91220, -91219, -91217, -91216, -91215,
-91214, -91213, -91212, -91210, -91209, -91208, -91207, -91205,
-91201, -91200, -91199, -91198, -91196, -91195, -91194, -91193,
-91192, -91191, -91189, -91188, -91187, -91186, -91185, -91184,
-91182, -91181, -91180, -91179, -91178, -91177, -91175, -91174,
-91173, -91172, -91171, -91170, -91168, -91167, -91166, -91165,
-91164, -91163, -91161, -91160, -91159, -91158, -91157, -91156,
-91154, -91153, -91152, -91151, -91150, -91149, -91147, -91146,
-91145, -91144, -91143, -91142, -91140, -91139, -91138, -91131,
-91130, -91129, -91128, -91126, -91125, -91124, -91123, -91122,
-91121, -91119, -91118, -91117, -91116, -91115, -91114, -91112,
-91111, -91110, -91109, -91108, -91107, -91104, -91103, -91102,
-91101, -91100, -91099, -91097, -91096, -91095, -91094, -91093,
-91091, -91090, -91089, -91088, -91087, -91086, -91084, -91083,
-91082, -91081, -91080, -91079, -91077, -91076, -91075, -91074,
-91073, -91072, -91070, -91069, -91068, -91065, -91063, -91062,
-91061, -91060, -91059, -91058, -91056, -91055, -91054, -91053,
-91052, -91051, -91049, -91048, -91047, -91046, -91045, -91044,
-91042, -91041, -91040, -91039, -91038, -91037, -91035, -91034,
-91033, -91032, -91031, -91030, -91028, -91027, -91026, -91025,
-91024, -91023, -91021, -91020, -91019, -91018, -91017, -91016,
-91014, -91013, -91012, -91011, -91010, -91009, -91007, -91006,
-91005, -91004, -91003, -91002, -91000, -90999, -90998, -90997,
-90996, -90995, -90993, -90992, -90991, -90990, -90989, -90988,
-90986, -90985, -90984, -90983, -90982), class = "Date"), value = c(113,
113, 113, 113, 114, 114, 114, 115, 115, 115, 116, 116, 116, 116,
117, 117, 117, 117, 116, 117, 116, 116, 116, 117, 117, 117, 117,
117, 117, 117, 116, 117, 116, 116, 116, 117, 117, 117, 117, 117,
117, 117, 116, 116, 117, 117, 117, 117, 117, 117, 117, 117, 117,
117, 117, 118, 118, 118, 118, 117, 118, 117, 117, 117, 117, 117,
117, 118, 116, 116, 116, 116, 116, 116, 116, 117, 117, 118, 118,
118, 118, 118, 119, 120, 120, 119, 119, 120, 120, 121, 121, 122,
124, 124, 122, 123, 124, 123, 123, 123, 123, 123, 124, 124, 126,
126, 126, 126, 126, 125, 125, 126, 127, 126, 126, 125, 126, 126,
126, 128, 128, 128, 130, 133, 131, 133, 134, 134, 134, 136, 136,
136, 135, 135, 135, 136, 136, 136, 136, 135, 135, 135, 135, 130,
129, 129, 130, 131, 136, 138, 155, 157, 161, 170, 174, 168, 165,
169, 171, 181, 184, 182, 179, 181, 179, 175, 177, 177, 174, 170,
174, 173, 178, 173, 178, 179, 182, 184, 184, 180, 181, 182, 182,
184, 184, 188, 195, 198, 220, 255, 275, 350, 310, 315, 320, 320,
316, 300, 310, 310, 320, 317, 313, 312, 310, 297, 285, 285, 286,
288, 315, 328, 338, 344, 345, 352, 352, 342, 335, 343, 340, 342,
339, 337, 336, 336, 342, 347, 352, 352, 351, 352, 352, 351, 352,
352, 355, 375, 400, 452, 487, 476, 475, 473, 485, 500, 530, 595,
720, 720, 770, 750, 770, 750, 735, 740, 745, 735, 700, 700, 750,
760, 755, 755, 760, 760, 765, 950, 950, 950, 875, 875, 875, 880,
880, 880, 900, 900, 900, 880, 880, 890, 895, 890, 880, 870, 870,
870, 870, 870, 860, 860, 860, 860, 850, 840, 810, 820, 810, 810,
805, 810, 805, 820, 815, 820, 805, 790, 800, 780, 760, 765, 750,
740, 820, 810, 800, 800, 775, 750, 810, 750, 740, 700, 705, 660,
630, 640, 595, 590, 570, 565, 535, 440, 400, 410, 400, 405, 390,
370, 300, 300, 180, 200, 310, 290, 260, 260, 275, 260, 270, 265,
255, 250, 210, 210, 200, 195, 210, 215, 240, 240, 220, 220, 220,
220, 210, 212, 208, 220, 210, 212, 208, 220, 215, 220, 214, 214,
213, 212, 210, 210, 195, 195, 160, 160, 175, 205, 210, 208, 197,
181, 185)), .Names = c("Date", "value"), row.names = c(NA, 393L
), class = "data.frame")
The code below provides two potential options for adding year labels.
Option 1a: Faceting
You could use faceting to mark the years. For example:
library(ggplot2)
library(lubridate)
ggplot(df, aes(Date, value)) +
geom_line() +
scale_x_date(date_labels="%b", date_breaks="month", expand=c(0,0)) +
facet_grid(~ year(Date), space="free_x", scales="free_x", switch="x") +
theme_bw() +
theme(strip.placement = "outside",
strip.background = element_rect(fill=NA,colour="grey50"),
panel.spacing=unit(0,"cm"))
Note that with this approach, if there are missing dates at the beginning or end of a year (by "missing", I mean rows for those dates are not even present in the data) then the x-axis will start/end at the first/last date in the data for that year, rather than go from Jan-1 to Dec-31. In that case, you'd need to add in rows for the missing dates and either NA for value or interpolate value. In addition, with this method there is no space or line between December 31 of one year and January 1 of the next year, so there's a discontinuity across each year.
Option 1b: Faceting + centered month labels
To address #AF7's comment. You can center the month labels by adding some spaces before each label. But you have to choose the number of spaces manually, depending on the physical size of the plot when you print it to a device. (There's probably a way to center the labels programmatically based on the internal grob measurements, but I'm not sure how to do it.) I've also removed the minor vertical gridlines and lightened the line between years.
ggplot(df, aes(Date, value)) +
geom_line() +
scale_x_date(date_labels=paste(c(rep(" ",11), "%b"), collapse=""),
date_breaks="month", expand=c(0,0)) +
facet_grid(~ year(Date), space="free_x", scales="free_x", switch="x") +
theme_bw() +
theme(strip.placement = "outside",
strip.background = element_blank(),
panel.grid.minor.x = element_blank(),
panel.border = element_rect(colour="grey70"),
panel.spacing=unit(0,"cm"))
Option 2a: Edit the x-axis label grob
Here's a more complex and finicky method (though it could likely be automated by someone who understands the structure and unit spacings of grid graphics better than I do) that avoids the pitfalls of the faceting method described above:
library(grid)
# Fake data with an extra year added for illustration
set.seed(2)
df = data.frame(Date=seq(as.Date("1718-03-01"),as.Date("1721-09-20"), by="1 day"))
df$value = cumsum(rnorm(nrow(df)))
# The plot we'll start with
p = ggplot(df, aes(Date, value)) +
geom_vline(xintercept=as.numeric(df$Date[yday(df$Date)==1]), colour="grey60") +
geom_line() +
scale_x_date(date_labels="%b", date_breaks="month", expand=c(0,0)) +
theme_bw() +
theme(panel.grid.minor.x = element_blank()) +
labs(x="")
Now we want to add the year values below and in between June and July of each year. The code below does that by modifying the x-axis label grob and is adapted from this SO answer by #SandyMuspratt.
# Get the grob
g <- ggplotGrob(p)
# Get the y axis
index <- which(g$layout$name == "axis-b") # Which grob
xaxis <- g$grobs[[index]]
# Get the ticks (labels and marks)
ticks <- xaxis$children[[2]]
# Get the labels
ticksB <- ticks$grobs[[2]]
# Edit x-axis label grob
# Find every index of Jun in the x-axis labels and add a newline and
# then a year label
junes = which(ticksB$children[[1]]$label == "Jun")
ticksB$children[[1]]$label[junes] = paste0(ticksB$children[[1]]$label[junes],
"\n ", unique(year(df$Date)))
# Put the edited labels back into the plot
ticks$grobs[[2]] <- ticksB
xaxis$children[[2]] <- ticks
g$grobs[[index]] <- xaxis
# Draw the plot
grid.newpage()
grid.draw(g)
Option 2b: Edit the x-axis label grob and center the month labels
Below is the only change that needs to be made to Option 2a to center the month labels, but, once again, the number of spaces needs to be tweaked manually.
# Make the edit
# Center the month labels between ticks
ticksB$children[[1]]$label = paste0(paste(rep(" ",7),collapse=""), ticksB$children[[1]]$label)
# Find every index of Jun in the x-axis labels and a year label
junes = grep("Jun", ticksB$children[[1]]$label)
ticksB$children[[1]]$label[junes] = paste0(ticksB$children[[1]]$label[junes], "\n ", unique(year(df$Date)))
I came upon this question and thought maybe I can add a solution. We can display both month and year in every year's first displayed month by using a simple condition. You can play with the date_breaks to remove January from the labels, and this will still work. I'm using month() and year() from lubridate.
library(tidyverse)
library(lubridate)
df %>%
ggplot(aes(Date, value)) +
geom_line() +
scale_x_date(date_breaks = "2 months",
labels = function(x) if_else(is.na(lag(x)) | !year(lag(x)) == year(x),
paste(month(x, label = TRUE), "\n", year(x)),
paste(month(x, label = TRUE))))
If you want to try to hack together a sub-label, you could convert it to a grob. I edited this from the original post to create a function that adds the sublabels and returns a gtable object. Note that the sublabs input must be the same length as your x-axis breaks:
library(grid)
library(gtable)
library(gridExtra)
add_sublabs <- function(plot, sublabs){
gg <- ggplotGrob(plot)
axis_num <- which(gg$layout[,"name"] == "axis-b")
xbreaks <- gg[["grobs"]][[axis_num]][["children"]][[2]][["grobs"]][[2]][["children"]][[1]]$x
if(length(xbreaks) != length(sublabs)) stop("Sub-labels must be the same length as the x-axis breaks")
to_breaks <- c(as.numeric(xbreaks),1)[which(!duplicated(sublabs, fromLast = TRUE))+1]
sublabs_x <- diff(c(0,to_breaks))
sublabs_labels <- sublabs[!duplicated(sublabs, fromLast = TRUE)]
tg <- tableGrob(matrix(sublabs_labels, nrow = 1))
tg$widths = unit(sublabs_x, attr(xbreaks,"unit"))
pos <- gg$layout[axis_num,c("t","l")]
gg2 <- gtable_add_rows(gg, heights = sum(tg$heights)+unit(4,"mm"), pos = pos$t)
gg3 <- gtable_add_grob(gg2, tg, t = pos$t+1, l = pos$l)
return(gg3)
}
#Plot and sublabels
p <- ggplot(data = df, aes(x = Date, y = value)) + geom_line() +
scale_x_date(date_breaks = "2 month", date_minor_breaks = "1 month", labels = date_format("%b")) +
xlab(NULL)
sublabs <- c(rep("1719",2),rep("1720",6))
#Draw
grid.draw(add_sublabs(p, sublabs))
One way to avoid the complexities would be to change the required output so that January is replaced by the year.
The lab function returns the labels given the breaks. Unexpectedly, ggplot will pass NAs to it so in the first line of the function body we replace those with some date -- it does not matter which date since such values are not subsequently used by ggplot. Finally we format the date as a year or abbreviated month depending on whether the month is January (which corresponds to the POSIXlt component mon equalling 0) or not.
library(ggplot2)
library(scales)
lab <- function(b) {
b[is.na(b)] <- Sys.Date()
format(b, ifelse(as.POSIXlt(b)$mon == 0, "%Y", "%b"))
}
ggplot(df, aes(Date, value)) +
geom_line() +
scale_x_date(date_breaks = "month", labels = lab)
Note: I have added Issue 2182 to the ggplot2 github issues list regarding the NAs that are passed to the label function. If subsequent versions of ggplot2 no longer pass the NAs then the first line of the body of lab could be omitted .
Update: fixed.
Related
How do I make the df plot look like the df1 plot? The df plot has overlap of x-axis content when going from Dec 2019 to Jan 2020. In the df plot, I tried changing expand = c(0,15) which makes the x-axis labels look much better, but the line is not connected between Dec 2019 and Jan 2020.
In df1 plot, how do I remove the border around the years and the line in-between the years on the x-axis label?
library(tidyverse)
# What I have: 2019-Dec and 2020-Jan overlap on the facet
date <- as.Date(c("2019-09-01","2019-10-01","2019-11-01","2019-12-01",
"2020-01-01","2020-02-01","2020-03-01","2020-04-01"))
year <- c(2019,2019,2019,2019,2020,2020,2020,2020)
month <- c("Sep", "Oct", "Nov", "Dec", "Jan", "Feb", "Mar", "Apr")
sales <- c(100,200,600,200,100,100,800,100)
df <- data.frame(date,year,month,sales)
ggplot(df, aes(x=date, y=sales)) +
geom_line() +
scale_x_date(date_labels=paste(c(rep(" ",11), "%b"), collapse=""),
date_breaks="month", expand=c(0,0)) +
facet_grid(~ year(date), space="free_x", scales="free_x", switch = "x") +
theme_bw() +
theme(strip.placement = "outside",
strip.background = element_rect(fill=NA,colour="grey50"),
panel.spacing=unit(0,"lines"),
axis.line = element_line(colour = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
panel.background = element_blank(),
panel.margin.x=unit(0, "cm"))
#What I want
df1 <- structure(list(Date = structure(c(-91455, -91454, -91453, -91452,
-91451, -91450, -91448, -91447, -91446, -91445, -91444, -91443,
-91441, -91440, -91439, -91438, -91437, -91436, -91434, -91433,
-91431, -91430, -91429, -91427, -91426, -91425, -91424, -91423,
-91422, -91420, -91419, -91418, -91417, -91416, -91415, -91413,
-91412, -91411, -91410, -91409, -91408, -91406, -91405, -91404,
-91403, -91402, -91401, -91399, -91398, -91397, -91396, -91395,
-91394, -91392, -91391, -91390, -91389, -91388, -91387, -91385,
-91384, -91382, -91381, -91380, -91379, -91377, -91376, -91375,
-91374, -91373, -91372, -91371, -91370, -91369, -91368, -91367,
-91366, -91364, -91363, -91362, -91361, -91360, -91359, -91357,
-91356, -91355, -91354, -91353, -91352, -91350, -91349, -91348,
-91347, -91346, -91345, -91343, -91342, -91341, -91340, -91339,
-91338, -91336, -91335, -91334, -91333, -91332, -91331, -91329,
-91328, -91327, -91326, -91325, -91324, -91322, -91321, -91320,
-91319, -91315, -91314, -91313, -91312, -91311, -91310, -91308,
-91307, -91306, -91305, -91304, -91303, -91301, -91300, -91299,
-91298, -91297, -91296, -91294, -91293, -91292, -91291, -91290,
-91289, -91287, -91286, -91285, -91284, -91283, -91282, -91280,
-91279, -91278, -91277, -91276, -91275, -91273, -91272, -91271,
-91270, -91269, -91268, -91266, -91265, -91264, -91263, -91262,
-91261, -91259, -91258, -91257, -91256, -91255, -91254, -91252,
-91251, -91250, -91249, -91248, -91247, -91245, -91244, -91243,
-91242, -91241, -91240, -91238, -91237, -91236, -91235, -91234,
-91233, -91231, -91230, -91229, -91228, -91227, -91226, -91224,
-91223, -91222, -91221, -91220, -91219, -91217, -91216, -91215,
-91214, -91213, -91212, -91210, -91209, -91208, -91207, -91205,
-91201, -91200, -91199, -91198, -91196, -91195, -91194, -91193,
-91192, -91191, -91189, -91188, -91187, -91186, -91185, -91184,
-91182, -91181, -91180, -91179, -91178, -91177, -91175, -91174,
-91173, -91172, -91171, -91170, -91168, -91167, -91166, -91165,
-91164, -91163, -91161, -91160, -91159, -91158, -91157, -91156,
-91154, -91153, -91152, -91151, -91150, -91149, -91147, -91146,
-91145, -91144, -91143, -91142, -91140, -91139, -91138, -91131,
-91130, -91129, -91128, -91126, -91125, -91124, -91123, -91122,
-91121, -91119, -91118, -91117, -91116, -91115, -91114, -91112,
-91111, -91110, -91109, -91108, -91107, -91104, -91103, -91102,
-91101, -91100, -91099, -91097, -91096, -91095, -91094, -91093,
-91091, -91090, -91089, -91088, -91087, -91086, -91084, -91083,
-91082, -91081, -91080, -91079, -91077, -91076, -91075, -91074,
-91073, -91072, -91070, -91069, -91068, -91065, -91063, -91062,
-91061, -91060, -91059, -91058, -91056, -91055, -91054, -91053,
-91052, -91051, -91049, -91048, -91047, -91046, -91045, -91044,
-91042, -91041, -91040, -91039, -91038, -91037, -91035, -91034,
-91033, -91032, -91031, -91030, -91028, -91027, -91026, -91025,
-91024, -91023, -91021, -91020, -91019, -91018, -91017, -91016,
-91014, -91013, -91012, -91011, -91010, -91009, -91007, -91006,
-91005, -91004, -91003, -91002, -91000, -90999, -90998, -90997,
-90996, -90995, -90993, -90992, -90991, -90990, -90989, -90988,
-90986, -90985, -90984, -90983, -90982), class = "Date"), value = c(113,
113, 113, 113, 114, 114, 114, 115, 115, 115, 116, 116, 116, 116,
117, 117, 117, 117, 116, 117, 116, 116, 116, 117, 117, 117, 117,
117, 117, 117, 116, 117, 116, 116, 116, 117, 117, 117, 117, 117,
117, 117, 116, 116, 117, 117, 117, 117, 117, 117, 117, 117, 117,
117, 117, 118, 118, 118, 118, 117, 118, 117, 117, 117, 117, 117,
117, 118, 116, 116, 116, 116, 116, 116, 116, 117, 117, 118, 118,
118, 118, 118, 119, 120, 120, 119, 119, 120, 120, 121, 121, 122,
124, 124, 122, 123, 124, 123, 123, 123, 123, 123, 124, 124, 126,
126, 126, 126, 126, 125, 125, 126, 127, 126, 126, 125, 126, 126,
126, 128, 128, 128, 130, 133, 131, 133, 134, 134, 134, 136, 136,
136, 135, 135, 135, 136, 136, 136, 136, 135, 135, 135, 135, 130,
129, 129, 130, 131, 136, 138, 155, 157, 161, 170, 174, 168, 165,
169, 171, 181, 184, 182, 179, 181, 179, 175, 177, 177, 174, 170,
174, 173, 178, 173, 178, 179, 182, 184, 184, 180, 181, 182, 182,
184, 184, 188, 195, 198, 220, 255, 275, 350, 310, 315, 320, 320,
316, 300, 310, 310, 320, 317, 313, 312, 310, 297, 285, 285, 286,
288, 315, 328, 338, 344, 345, 352, 352, 342, 335, 343, 340, 342,
339, 337, 336, 336, 342, 347, 352, 352, 351, 352, 352, 351, 352,
352, 355, 375, 400, 452, 487, 476, 475, 473, 485, 500, 530, 595,
720, 720, 770, 750, 770, 750, 735, 740, 745, 735, 700, 700, 750,
760, 755, 755, 760, 760, 765, 950, 950, 950, 875, 875, 875, 880,
880, 880, 900, 900, 900, 880, 880, 890, 895, 890, 880, 870, 870,
870, 870, 870, 860, 860, 860, 860, 850, 840, 810, 820, 810, 810,
805, 810, 805, 820, 815, 820, 805, 790, 800, 780, 760, 765, 750,
740, 820, 810, 800, 800, 775, 750, 810, 750, 740, 700, 705, 660,
630, 640, 595, 590, 570, 565, 535, 440, 400, 410, 400, 405, 390,
370, 300, 300, 180, 200, 310, 290, 260, 260, 275, 260, 270, 265,
255, 250, 210, 210, 200, 195, 210, 215, 240, 240, 220, 220, 220,
220, 210, 212, 208, 220, 210, 212, 208, 220, 215, 220, 214, 214,
213, 212, 210, 210, 195, 195, 160, 160, 175, 205, 210, 208, 197,
181, 185)), row.names = c(NA, 393L), class = "data.frame")
ggplot(df1, aes(Date, value)) +
geom_line() +
scale_x_date(date_labels=paste(c(rep(" ",11), "%b"), collapse=""),
date_breaks="month", expand=c(0,0)) +
facet_grid(~ year(Date), space="free_x", scales="free_x", switch="x") +
theme_bw() +
theme(strip.placement = "outside",
strip.background = element_rect(fill=NA,colour="grey50"),
panel.spacing=unit(0,"cm"),
axis.line = element_line(colour = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
panel.background = element_blank())
For your first question. The issue is that last obs. in df1 for year 2019 is December 1st, hence that's the max. value displayed on the x axis for this panel. In contrast for your df1 it's December 31st. As a consequence the axis labels overlap and the lines do not connect. Actually the lines do not connect either in your second plot. It only looks like this.
One fix for both issues would be to add a helper observation to your df dataset which extends the scale and at the same time "connects" the line to the year 2020:
Note: While adding spaces might work an alternative approach to shift your labels would be to adjust the horizontal alignment.
library(tidyverse)
df <- add_row(df, date = as.Date("2019-12-31"), year = 2019, month = "Dec", sales = 100)
ggplot(df, aes(x = date, y = sales)) +
geom_line() +
scale_x_date(
date_labels = "%b",
date_breaks = "month",
expand = c(0, 0)
) +
facet_grid(~ year(date), space = "free_x", scales = "free_x", switch = "x") +
theme_bw() +
theme(
strip.placement = "outside",
strip.background = element_rect(fill = NA, colour = "grey50"),
panel.spacing = unit(0, "lines"),
axis.line = element_line(colour = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
panel.background = element_blank(),
panel.margin.x = unit(0, "cm"),
axis.text.x = element_text(hjust = -.1)
)
For your second question. To get rid of the border and the line between years set the color for the strip background to NA:
ggplot(df1, aes(Date, value)) +
geom_line() +
scale_x_date(
date_labels = "%b",
date_breaks = "month", expand = c(0, 0)
) +
facet_grid(~ year(Date), space = "free_x", scales = "free_x", switch = "x") +
theme_bw() +
theme(
strip.placement = "outside",
strip.background = element_rect(fill = NA, colour = NA),
panel.spacing = unit(0, "cm"),
axis.line = element_line(colour = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
panel.background = element_blank(),
axis.text.x = element_text(hjust = -.1)
)
The graph we created should show the y column in ascending order, but it changes the scale or the order maybe??
dev.new()
library(ggplot2)
tabela <- read.table("C:/Users/pedro/OneDrive/Ambiente de Trabalho/hicp.csv", sep = ';', header = T)
tabela_bread_c = tabela[c(5, 34, 63, 92, 121, 150, 179, 208, 237, 266, 295, 324, 353), 3]
head(tabela_bread_c, n=13L)
tabela_poultry = tabela[c(18, 47, 76, 105, 134, 163, 192, 221, 250, 279, 308, 337, 366), 3]
head(tabela_poultry, n=13L)
anos = tabela[c(5, 34, 63, 92, 121, 150, 179, 208, 237, 266, 295, 324, 353), 1]
head(anos, n=13L)
dados_totais = cbind(anos, tabela_bread_c)
colnames(dados_totais) <- c("Ano", "valor")
dados_totais <- strtoi(dados_totais, base=16)
df <- data.frame(x=anos, y=tabela_bread_c)
head(df, n=13L)
ggplot(df, aes(x = x, y = y)) +
geom_point() +
labs(title="Evolução de preços") + ylab("Preço dos Produtos") + xlab("Ano e Mês")
We don't really know how to fix this
Reading from left to right along the x-axis of a Monthplot in R {stats}, I would like the first month to be June (and the last to be May). The axis is defaulting to January-to-December. How do I do make that change?
The first observation in my time series of monthly production data is June, being the start of the relevant agricultural production year. I have formatted my data as a vector of ts class, starting at June.
Using ggplot2 you could use scale_x_date to tweak the limits of your x (Date) axis:
# Generate some sample data
set.seed(2017);
df <- data.frame(
Date = seq(as.Date("2018-01-01"), by = "month", length.out = 24),
Value = runif(24))
library(ggplot2);
ggplot(df, aes(Date, Value)) +
geom_line() +
scale_x_date(
date_labels = "%b\n%Y",
date_breaks = "2 month",
limits = c(as.Date("2017-06-01"), as.Date("2019-05-31")))
The answer, I've discovered, lies in the 'phase' parameter in the monthplot arguments. For monthly data, 'phase' defaults to a January-December cycle. Compare the default monthplplot below, which uses the full AirPassengers dataset from January 1949 to December 1960, with the monthplot which follows it for the extracted 11-year window from June 1949 to May 1960. In the second monthplot, 'phase' is redefined by the variable 'cycPar'.
###------------------------------------------------------------------------------
### 1. Import data
###------------------------------------------------------------------------------
AirPassengers <- structure(c(112, 118, 132, 129, 121, 135, 148, 148, 136, 119,
104, 118, 115, 126, 141, 135, 125, 149, 170, 170, 158, 133, 114,
140, 145, 150, 178, 163, 172, 178, 199, 199, 184, 162, 146, 166,
171, 180, 193, 181, 183, 218, 230, 242, 209, 191, 172, 194, 196,
196, 236, 235, 229, 243, 264, 272, 237, 211, 180, 201, 204, 188,
235, 227, 234, 264, 302, 293, 259, 229, 203, 229, 242, 233, 267,
269, 270, 315, 364, 347, 312, 274, 237, 278, 284, 277, 317, 313,
318, 374, 413, 405, 355, 306, 271, 306, 315, 301, 356, 348, 355,
422, 465, 467, 404, 347, 305, 336, 340, 318, 362, 348, 363, 435,
491, 505, 404, 359, 310, 337, 360, 342, 406, 396, 420, 472, 548,
559, 463, 407, 362, 405, 417, 391, 419, 461, 472, 535, 622, 606,
508, 461, 390, 432), .Tsp = c(1949, 1960.91666666667, 12), class = "ts")
###------------------------------------------------------------------------------
### 2. Plot the data using default monthplot - Jan-Dec cycle
###------------------------------------------------------------------------------
monthplot(AirPassengers, main = "Monthly International Air Passenger Numbers",
xlab = "Monthly Pax Numbers Jan '49 - Dec '60",
ylab = "")
###------------------------------------------------------------------------------
### 3. Extract the window from June 1949 to May 1960 and plot on a Jun-May cycle
###------------------------------------------------------------------------------
junMayYears <- window(AirPassengers, start = c(1949, 6), end = c(1960, 5), frequency
= 12)
cycPar <- ts(rep(c("Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec", "Jan", "Feb",
"Mar", "Apr", "May"), 11), start = c(1949, 6), frequency = 12)
monthplot(junMayYears,
phase = cycPar,
main = "Monthly International Air Passenger Numbers",
xlab = "Monthly Pax Numbers Jun '49 - May '60",
ylab = "")
###------------------------------------------------------------------------------
### END
###------------------------------------------------------------------------------
I would like to display months (in abbreviated form) along the horizontal axis, with the corresponding year printed once. I know how to display month-year:
The un-needed repetition of the year clutters the labels. Instead I would like something like this:
except that the year would be printed below the months.
I printed the year above the axis labels, because that's the best I could do. This follows a limitation of the annotate() function, which gets clipped if it lies outside of the plot area. I am aware of possible workarounds based on annotate_custom(), but I couldn't make them to work with date objects (I did not try to convert dates to numbers and back to dates again, as it seemed more complicated than hopefully necessary)
I'm wondering if the new dup_axis() could be hijacked for this purpose. If instead of sending the duplicated axis to the opposite side of the panel, it could send it a few lines below the duplicated axis, then perhaps it would just be a matter of setting up one axis with panel.grid.major blanked out and the labels set to %b, while the other axis would have panel.grid.minor blanked out and the labels set to %Y. (an added challenge is that the year labels would be shifted to October instead of January)
These questions are related. However, the annotate_custom() function and textGrob() functions do not play well with dates, as far as I can tell.
how-can-i-add-annotations-below-the-x-axis-in-ggplot2
displaying-text-below-the-plot-generated-by-ggplot2
Data and basic code below:
library("ggplot2")
library("scales")
ggplot(data = df, aes(x = Date, y = value)) + geom_line() +
scale_x_date(date_breaks = "2 month", date_minor_breaks = "1 month", labels = date_format("%b %Y")) +
xlab(NULL)
ggplot(data = df, aes(x = Date, y = value)) + geom_line() +
scale_x_date(date_minor_breaks = "2 month", labels = date_format("%b")) +
annotate(geom = "text", x = as.Date("1719-10-01"), y = 0, label = "1719") +
annotate(geom = "text", x = as.Date("1720-10-01"), y = 0, label = "1720") +
xlab(NULL)
# data
df <- structure(list(Date = structure(c(-91455, -91454, -91453, -91452,
-91451, -91450, -91448, -91447, -91446, -91445, -91444, -91443,
-91441, -91440, -91439, -91438, -91437, -91436, -91434, -91433,
-91431, -91430, -91429, -91427, -91426, -91425, -91424, -91423,
-91422, -91420, -91419, -91418, -91417, -91416, -91415, -91413,
-91412, -91411, -91410, -91409, -91408, -91406, -91405, -91404,
-91403, -91402, -91401, -91399, -91398, -91397, -91396, -91395,
-91394, -91392, -91391, -91390, -91389, -91388, -91387, -91385,
-91384, -91382, -91381, -91380, -91379, -91377, -91376, -91375,
-91374, -91373, -91372, -91371, -91370, -91369, -91368, -91367,
-91366, -91364, -91363, -91362, -91361, -91360, -91359, -91357,
-91356, -91355, -91354, -91353, -91352, -91350, -91349, -91348,
-91347, -91346, -91345, -91343, -91342, -91341, -91340, -91339,
-91338, -91336, -91335, -91334, -91333, -91332, -91331, -91329,
-91328, -91327, -91326, -91325, -91324, -91322, -91321, -91320,
-91319, -91315, -91314, -91313, -91312, -91311, -91310, -91308,
-91307, -91306, -91305, -91304, -91303, -91301, -91300, -91299,
-91298, -91297, -91296, -91294, -91293, -91292, -91291, -91290,
-91289, -91287, -91286, -91285, -91284, -91283, -91282, -91280,
-91279, -91278, -91277, -91276, -91275, -91273, -91272, -91271,
-91270, -91269, -91268, -91266, -91265, -91264, -91263, -91262,
-91261, -91259, -91258, -91257, -91256, -91255, -91254, -91252,
-91251, -91250, -91249, -91248, -91247, -91245, -91244, -91243,
-91242, -91241, -91240, -91238, -91237, -91236, -91235, -91234,
-91233, -91231, -91230, -91229, -91228, -91227, -91226, -91224,
-91223, -91222, -91221, -91220, -91219, -91217, -91216, -91215,
-91214, -91213, -91212, -91210, -91209, -91208, -91207, -91205,
-91201, -91200, -91199, -91198, -91196, -91195, -91194, -91193,
-91192, -91191, -91189, -91188, -91187, -91186, -91185, -91184,
-91182, -91181, -91180, -91179, -91178, -91177, -91175, -91174,
-91173, -91172, -91171, -91170, -91168, -91167, -91166, -91165,
-91164, -91163, -91161, -91160, -91159, -91158, -91157, -91156,
-91154, -91153, -91152, -91151, -91150, -91149, -91147, -91146,
-91145, -91144, -91143, -91142, -91140, -91139, -91138, -91131,
-91130, -91129, -91128, -91126, -91125, -91124, -91123, -91122,
-91121, -91119, -91118, -91117, -91116, -91115, -91114, -91112,
-91111, -91110, -91109, -91108, -91107, -91104, -91103, -91102,
-91101, -91100, -91099, -91097, -91096, -91095, -91094, -91093,
-91091, -91090, -91089, -91088, -91087, -91086, -91084, -91083,
-91082, -91081, -91080, -91079, -91077, -91076, -91075, -91074,
-91073, -91072, -91070, -91069, -91068, -91065, -91063, -91062,
-91061, -91060, -91059, -91058, -91056, -91055, -91054, -91053,
-91052, -91051, -91049, -91048, -91047, -91046, -91045, -91044,
-91042, -91041, -91040, -91039, -91038, -91037, -91035, -91034,
-91033, -91032, -91031, -91030, -91028, -91027, -91026, -91025,
-91024, -91023, -91021, -91020, -91019, -91018, -91017, -91016,
-91014, -91013, -91012, -91011, -91010, -91009, -91007, -91006,
-91005, -91004, -91003, -91002, -91000, -90999, -90998, -90997,
-90996, -90995, -90993, -90992, -90991, -90990, -90989, -90988,
-90986, -90985, -90984, -90983, -90982), class = "Date"), value = c(113,
113, 113, 113, 114, 114, 114, 115, 115, 115, 116, 116, 116, 116,
117, 117, 117, 117, 116, 117, 116, 116, 116, 117, 117, 117, 117,
117, 117, 117, 116, 117, 116, 116, 116, 117, 117, 117, 117, 117,
117, 117, 116, 116, 117, 117, 117, 117, 117, 117, 117, 117, 117,
117, 117, 118, 118, 118, 118, 117, 118, 117, 117, 117, 117, 117,
117, 118, 116, 116, 116, 116, 116, 116, 116, 117, 117, 118, 118,
118, 118, 118, 119, 120, 120, 119, 119, 120, 120, 121, 121, 122,
124, 124, 122, 123, 124, 123, 123, 123, 123, 123, 124, 124, 126,
126, 126, 126, 126, 125, 125, 126, 127, 126, 126, 125, 126, 126,
126, 128, 128, 128, 130, 133, 131, 133, 134, 134, 134, 136, 136,
136, 135, 135, 135, 136, 136, 136, 136, 135, 135, 135, 135, 130,
129, 129, 130, 131, 136, 138, 155, 157, 161, 170, 174, 168, 165,
169, 171, 181, 184, 182, 179, 181, 179, 175, 177, 177, 174, 170,
174, 173, 178, 173, 178, 179, 182, 184, 184, 180, 181, 182, 182,
184, 184, 188, 195, 198, 220, 255, 275, 350, 310, 315, 320, 320,
316, 300, 310, 310, 320, 317, 313, 312, 310, 297, 285, 285, 286,
288, 315, 328, 338, 344, 345, 352, 352, 342, 335, 343, 340, 342,
339, 337, 336, 336, 342, 347, 352, 352, 351, 352, 352, 351, 352,
352, 355, 375, 400, 452, 487, 476, 475, 473, 485, 500, 530, 595,
720, 720, 770, 750, 770, 750, 735, 740, 745, 735, 700, 700, 750,
760, 755, 755, 760, 760, 765, 950, 950, 950, 875, 875, 875, 880,
880, 880, 900, 900, 900, 880, 880, 890, 895, 890, 880, 870, 870,
870, 870, 870, 860, 860, 860, 860, 850, 840, 810, 820, 810, 810,
805, 810, 805, 820, 815, 820, 805, 790, 800, 780, 760, 765, 750,
740, 820, 810, 800, 800, 775, 750, 810, 750, 740, 700, 705, 660,
630, 640, 595, 590, 570, 565, 535, 440, 400, 410, 400, 405, 390,
370, 300, 300, 180, 200, 310, 290, 260, 260, 275, 260, 270, 265,
255, 250, 210, 210, 200, 195, 210, 215, 240, 240, 220, 220, 220,
220, 210, 212, 208, 220, 210, 212, 208, 220, 215, 220, 214, 214,
213, 212, 210, 210, 195, 195, 160, 160, 175, 205, 210, 208, 197,
181, 185)), .Names = c("Date", "value"), row.names = c(NA, 393L
), class = "data.frame")
The code below provides two potential options for adding year labels.
Option 1a: Faceting
You could use faceting to mark the years. For example:
library(ggplot2)
library(lubridate)
ggplot(df, aes(Date, value)) +
geom_line() +
scale_x_date(date_labels="%b", date_breaks="month", expand=c(0,0)) +
facet_grid(~ year(Date), space="free_x", scales="free_x", switch="x") +
theme_bw() +
theme(strip.placement = "outside",
strip.background = element_rect(fill=NA,colour="grey50"),
panel.spacing=unit(0,"cm"))
Note that with this approach, if there are missing dates at the beginning or end of a year (by "missing", I mean rows for those dates are not even present in the data) then the x-axis will start/end at the first/last date in the data for that year, rather than go from Jan-1 to Dec-31. In that case, you'd need to add in rows for the missing dates and either NA for value or interpolate value. In addition, with this method there is no space or line between December 31 of one year and January 1 of the next year, so there's a discontinuity across each year.
Option 1b: Faceting + centered month labels
To address #AF7's comment. You can center the month labels by adding some spaces before each label. But you have to choose the number of spaces manually, depending on the physical size of the plot when you print it to a device. (There's probably a way to center the labels programmatically based on the internal grob measurements, but I'm not sure how to do it.) I've also removed the minor vertical gridlines and lightened the line between years.
ggplot(df, aes(Date, value)) +
geom_line() +
scale_x_date(date_labels=paste(c(rep(" ",11), "%b"), collapse=""),
date_breaks="month", expand=c(0,0)) +
facet_grid(~ year(Date), space="free_x", scales="free_x", switch="x") +
theme_bw() +
theme(strip.placement = "outside",
strip.background = element_blank(),
panel.grid.minor.x = element_blank(),
panel.border = element_rect(colour="grey70"),
panel.spacing=unit(0,"cm"))
Option 2a: Edit the x-axis label grob
Here's a more complex and finicky method (though it could likely be automated by someone who understands the structure and unit spacings of grid graphics better than I do) that avoids the pitfalls of the faceting method described above:
library(grid)
# Fake data with an extra year added for illustration
set.seed(2)
df = data.frame(Date=seq(as.Date("1718-03-01"),as.Date("1721-09-20"), by="1 day"))
df$value = cumsum(rnorm(nrow(df)))
# The plot we'll start with
p = ggplot(df, aes(Date, value)) +
geom_vline(xintercept=as.numeric(df$Date[yday(df$Date)==1]), colour="grey60") +
geom_line() +
scale_x_date(date_labels="%b", date_breaks="month", expand=c(0,0)) +
theme_bw() +
theme(panel.grid.minor.x = element_blank()) +
labs(x="")
Now we want to add the year values below and in between June and July of each year. The code below does that by modifying the x-axis label grob and is adapted from this SO answer by #SandyMuspratt.
# Get the grob
g <- ggplotGrob(p)
# Get the y axis
index <- which(g$layout$name == "axis-b") # Which grob
xaxis <- g$grobs[[index]]
# Get the ticks (labels and marks)
ticks <- xaxis$children[[2]]
# Get the labels
ticksB <- ticks$grobs[[2]]
# Edit x-axis label grob
# Find every index of Jun in the x-axis labels and add a newline and
# then a year label
junes = which(ticksB$children[[1]]$label == "Jun")
ticksB$children[[1]]$label[junes] = paste0(ticksB$children[[1]]$label[junes],
"\n ", unique(year(df$Date)))
# Put the edited labels back into the plot
ticks$grobs[[2]] <- ticksB
xaxis$children[[2]] <- ticks
g$grobs[[index]] <- xaxis
# Draw the plot
grid.newpage()
grid.draw(g)
Option 2b: Edit the x-axis label grob and center the month labels
Below is the only change that needs to be made to Option 2a to center the month labels, but, once again, the number of spaces needs to be tweaked manually.
# Make the edit
# Center the month labels between ticks
ticksB$children[[1]]$label = paste0(paste(rep(" ",7),collapse=""), ticksB$children[[1]]$label)
# Find every index of Jun in the x-axis labels and a year label
junes = grep("Jun", ticksB$children[[1]]$label)
ticksB$children[[1]]$label[junes] = paste0(ticksB$children[[1]]$label[junes], "\n ", unique(year(df$Date)))
I came upon this question and thought maybe I can add a solution. We can display both month and year in every year's first displayed month by using a simple condition. You can play with the date_breaks to remove January from the labels, and this will still work. I'm using month() and year() from lubridate.
library(tidyverse)
library(lubridate)
df %>%
ggplot(aes(Date, value)) +
geom_line() +
scale_x_date(date_breaks = "2 months",
labels = function(x) if_else(is.na(lag(x)) | !year(lag(x)) == year(x),
paste(month(x, label = TRUE), "\n", year(x)),
paste(month(x, label = TRUE))))
If you want to try to hack together a sub-label, you could convert it to a grob. I edited this from the original post to create a function that adds the sublabels and returns a gtable object. Note that the sublabs input must be the same length as your x-axis breaks:
library(grid)
library(gtable)
library(gridExtra)
add_sublabs <- function(plot, sublabs){
gg <- ggplotGrob(plot)
axis_num <- which(gg$layout[,"name"] == "axis-b")
xbreaks <- gg[["grobs"]][[axis_num]][["children"]][[2]][["grobs"]][[2]][["children"]][[1]]$x
if(length(xbreaks) != length(sublabs)) stop("Sub-labels must be the same length as the x-axis breaks")
to_breaks <- c(as.numeric(xbreaks),1)[which(!duplicated(sublabs, fromLast = TRUE))+1]
sublabs_x <- diff(c(0,to_breaks))
sublabs_labels <- sublabs[!duplicated(sublabs, fromLast = TRUE)]
tg <- tableGrob(matrix(sublabs_labels, nrow = 1))
tg$widths = unit(sublabs_x, attr(xbreaks,"unit"))
pos <- gg$layout[axis_num,c("t","l")]
gg2 <- gtable_add_rows(gg, heights = sum(tg$heights)+unit(4,"mm"), pos = pos$t)
gg3 <- gtable_add_grob(gg2, tg, t = pos$t+1, l = pos$l)
return(gg3)
}
#Plot and sublabels
p <- ggplot(data = df, aes(x = Date, y = value)) + geom_line() +
scale_x_date(date_breaks = "2 month", date_minor_breaks = "1 month", labels = date_format("%b")) +
xlab(NULL)
sublabs <- c(rep("1719",2),rep("1720",6))
#Draw
grid.draw(add_sublabs(p, sublabs))
One way to avoid the complexities would be to change the required output so that January is replaced by the year.
The lab function returns the labels given the breaks. Unexpectedly, ggplot will pass NAs to it so in the first line of the function body we replace those with some date -- it does not matter which date since such values are not subsequently used by ggplot. Finally we format the date as a year or abbreviated month depending on whether the month is January (which corresponds to the POSIXlt component mon equalling 0) or not.
library(ggplot2)
library(scales)
lab <- function(b) {
b[is.na(b)] <- Sys.Date()
format(b, ifelse(as.POSIXlt(b)$mon == 0, "%Y", "%b"))
}
ggplot(df, aes(Date, value)) +
geom_line() +
scale_x_date(date_breaks = "month", labels = lab)
Note: I have added Issue 2182 to the ggplot2 github issues list regarding the NAs that are passed to the label function. If subsequent versions of ggplot2 no longer pass the NAs then the first line of the body of lab could be omitted .
Update: fixed.
Let's condsider the bagplot example as included in the aplpack library in R. A bagplot is a bivariate generalisation of a boxplot and therefore gives insight in the distribution of data points in both axes.
Example of a bagplot:
Code for the example:
# example of Rousseeuw et al., see R-package rpart
cardata <- structure(as.integer( c(2560,2345,1845,2260,2440,
2285, 2275, 2350, 2295, 1900, 2390, 2075, 2330, 3320, 2885,
3310, 2695, 2170, 2710, 2775, 2840, 2485, 2670, 2640, 2655,
3065, 2750, 2920, 2780, 2745, 3110, 2920, 2645, 2575, 2935,
2920, 2985, 3265, 2880, 2975, 3450, 3145, 3190, 3610, 2885,
3480, 3200, 2765, 3220, 3480, 3325, 3855, 3850, 3195, 3735,
3665, 3735, 3415, 3185, 3690, 97, 114, 81, 91, 113, 97, 97,
98, 109, 73, 97, 89, 109, 305, 153, 302, 133, 97, 125, 146,
107, 109, 121, 151, 133, 181, 141, 132, 133, 122, 181, 146,
151, 116, 135, 122, 141, 163, 151, 153, 202, 180, 182, 232,
143, 180, 180, 151, 189, 180, 231, 305, 302, 151, 202, 182,
181, 143, 146, 146)), .Dim = as.integer(c(60, 2)),
.Dimnames = list(NULL, c("Weight", "Disp.")))
bagplot(cardata,factor=3,show.baghull=TRUE,
show.loophull=TRUE,precision=1,dkmethod=2)
title("car data Chambers/Hastie 1992")
# points of y=x*x
bagplot(x=1:30,y=(1:30)^2,verbose=FALSE,dkmethod=2)
The bagplot of aplpack seems to only support plotting a "bag" for a single data series. Even more interesting would be to plot two (or three) data series within a single bagplot, where visually comparing the "bags" of the data series gives insight in the differences in the data distributions of the data series. Does anyone know if (and if so, how) this can be done in R?
If we modify some of the aplpack::bagplot code we can make a new geom for ggplot2. Then we can compare groups within a dataset in the usual ggplot2 ways. Here's one example:
library(ggplot2)
ggplot(iris, aes(Sepal.Length, Sepal.Width,
colour = Species, fill = Species)) +
geom_bag() +
theme_minimal()
and we can show the points with the bagplot:
ggplot(iris, aes(Sepal.Length, Sepal.Width,
colour = Species, fill = Species)) +
geom_bag() +
geom_point() +
theme_minimal()
Here's the code for the geom_bag and modified aplpack::bagplot function: https://gist.github.com/benmarwick/00772ccea2dd0b0f1745