align labels with calendar quarters in ggplot - r

Consider this simple example
tibble(date = c(ymd('2021-01-01'),
ymd('2021-04-01')),
value = c(10, 20)) %>%
ggplot(aes(x = date, y = value)) + geom_col()
As you can see, the bars are centered on their dates. Instead, I would like the first bar to span January to March (first quarter) and the second bar to span April to June (second quarter).
How can I do that?
Thanks!

Perhaps we can adjust scale_x_date:
library(dplyr)
library(ggplot2)
library(lubridate)
library(stringr)
start <- seq(min(tbl1$date), max(tbl1$date), by = '3 months')
end <- start %m+% months(2)
start_end <- str_c(format(start, '%b %Y'), format(end, '%b %Y'), sep='--')
tbl1 %>%
ggplot(aes(x = date %m+% days(40), y = value)) +
geom_col() +
scale_x_date(breaks = start, labels = start_end) +
xlab("date")
data
tbl1 <- tibble(date = c(ymd('2021-01-01'),
ymd('2021-04-01')),
value = c(10, 20))
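An alternative sketch, not taken from the answer above: draw each bar as a rectangle that runs from the start of its quarter to the start of the next one, so the bar itself spans the three months instead of being shifted. It assumes every date is the first day of a quarter; quarter_end is a column name introduced here for illustration.

```r
library(ggplot2)
library(lubridate)

# Same toy data as in the question
tbl1 <- data.frame(date = ymd(c("2021-01-01", "2021-04-01")),
                   value = c(10, 20))

# Assumption: each date is the first day of a quarter, so the quarter
# ends three calendar months later
tbl1$quarter_end <- tbl1$date %m+% months(3)

p <- ggplot(tbl1) +
  geom_rect(aes(xmin = date, xmax = quarter_end, ymin = 0, ymax = value),
            colour = "white") +
  # Put a tick at every quarter boundary
  scale_x_date(breaks = c(tbl1$date, max(tbl1$quarter_end)),
               date_labels = "%b %Y") +
  labs(x = "date", y = "value")
p
```

Because the rectangles use real dates for both edges, the ticks fall exactly on the quarter boundaries and no offset such as days(40) is needed.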

Related

Plotting one daily time series per year in R (ggplot2)

Similar to this question: Split up time series per year for plotting, which was done in Python, I want to display the daily time series as multiple lines, one per year. How can I achieve this in R?
library(ggplot2)
library(dplyr)
# Dummy data
df <- data.frame(
day = as.Date("2017-06-14") - 0:364,
value = runif(365) + seq(-140, 224)^2 / 10000
)
# Basic line plot
p <- ggplot(df, aes(x=day, y=value)) +
geom_line() +
xlab("")
p
One solution is using ggplot2, but date_labels are displayed incorrectly:
library(tidyverse)
library(lubridate)
p <- df %>%
mutate(date = as.Date(day)) %>% # the dates in df are stored in the day column
mutate(
year = factor(year(date)), # use year to define separate curves
date = update(date, year = 1) # use a constant year for the x-axis
) %>%
ggplot(aes(date, value, color = year)) +
scale_x_date(date_breaks = "1 month", date_labels = "%b")
# Raw daily data
p + geom_line()
An alternative solution is to use gg_season() from the feasts package:
library(feasts)
library(tsibble)
library(dplyr)
tsibbledata::aus_retail %>%
filter(
State == "Victoria",
Industry == "Cafes, restaurants and catering services"
) %>%
gg_season(Turnover)
References:
Split up time series per year for plotting
R - How to create a seasonal plot - Different lines for years
If you want your x axis to represent the months from January to December, then perhaps getting the yday of the date and adding it to the first of January of an arbitrary year would be simplest:
library(tidyverse)
library(lubridate)
df <- data.frame(
day = as.Date("2017-06-14") - 0:364,
value = runif(365) + seq(-140, 224)^2 / 10000
)
df %>%
mutate(year = factor(year(day)), date = yday(day) - 1 + as.Date('2017-01-01')) %>% # yday() is 1 on 1 January, so subtract 1
ggplot(aes(date, value, color = year)) +
geom_line() +
scale_x_date(breaks = seq(as.Date('2017-01-01'), by = 'month', length = 12),
date_labels = '%b')
Created on 2023-02-07 with reprex v2.0.2
I tend to think simple is better:
transform(df, year = format(day, "%Y")) |>
ggplot(aes(x=day, y=value, group=year, color=year)) +
geom_line() +
xlab(NULL)
optionally removing the year legend with + guides(colour = "none").

ggplot2: Facetting by year and aligning x-axis dates by month

I am trying to plot daily data with days of the week (Monday:Sunday) on the y-axis, week of the year on the x-axis with monthly labels (January:December), and facet by year with each facet as its own row. I want the week of the year to align between the facets. I also want each tile to be square.
Here is a toy dataset to work with:
my_data <- tibble(Date = seq(
as.Date("1/11/2013", "%d/%m/%Y"),
as.Date("31/12/2014", "%d/%m/%Y"),
"days"),
Value = runif(length(Date))) # one random value per date
One solution I came up with is to use lubridate::week() to number the weeks and plot by week. This correctly aligns the x-axis between the facets. The problem is, I can't figure out how to label the x-axis with monthly labels.
my_data %>%
mutate(Week = week(Date)) %>%
mutate(Weekday = wday(Date, label = TRUE, week_start = 1)) %>%
mutate(Year = year(Date)) %>%
ggplot(aes(fill = Value, x = Week, y = Weekday)) +
geom_tile() +
theme_bw() +
facet_grid(Year ~ .) +
coord_fixed()
Alternatively, I tried plotting by the first day of the week using lubridate::floor_date and lubridate::round_date. In this solution, the x-axis is correctly labeled, but the weeks don't align between the two years. Also, the tiles aren't perfectly square, though I think this could be fixed by playing around with the coord_fixed ratio.
my_data %>%
mutate(Week = floor_date(Date, week_start = 1),
Week = round_date(Week, "week", week_start = 1)) %>%
mutate(Weekday = wday(Date, label = TRUE, week_start = 1)) %>%
mutate(Year = year(Date)) %>%
ggplot(aes(fill = Value, x = Week, y = Weekday)) +
geom_tile() +
theme_bw() +
facet_grid(Year ~ .) +
scale_x_datetime(name = NULL, labels = label_date("%b")) +
coord_fixed(7e5)
Any suggestions of how to get the columns to align correctly by week of the year while labeling the months correctly?
The concept is a little flawed, since the same week of the year is not guaranteed to fall in the same month. However, you can get a "close enough" solution by using the labels and breaks arguments of scale_x_continuous. The idea here is to write a function which takes a number of weeks, adds 7 times this value as a number of days onto an arbitrary year's 1st January, then formats the result as a month name only using strftime:
my_data %>%
mutate(Week = week(Date)) %>%
mutate(Weekday = wday(Date, label = TRUE, week_start = 1)) %>%
mutate(Year = year(Date)) %>%
ggplot(aes(fill = Value, x = Week, y = Weekday)) +
geom_tile() +
theme_bw() +
facet_grid(Year ~ .) +
coord_fixed() +
scale_x_continuous(labels = function(x) {
strftime(as.Date("2000-01-01") + 7 * x, "%B")
}, breaks = seq(1, 52, 4.2))
Another option if you're sick of reinventing the wheel is to use the calendarHeat function in the GitHub-only makeR package:
remotes::install_github("jbryer/makeR") # install_github() comes from the remotes package
library(makeR)
calendarHeat(my_data$Date, my_data$Value)
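As a sketch of a variant of the scale_x_continuous approach, the breaks can be placed at the week number in which each month starts rather than at evenly spaced intervals. This assumes 2014 as an arbitrary non-leap reference year (in a leap year the months from March onward start one day later, so the "close enough" caveat still applies); month_starts and week_breaks are names introduced here for illustration.

```r
library(ggplot2)
library(lubridate)

# Toy data covering the same period as the question
set.seed(1)
my_data <- data.frame(Date = seq(as.Date("2013-11-01"),
                                 as.Date("2014-12-31"), by = "day"))
my_data$Value   <- runif(nrow(my_data))
my_data$Week    <- week(my_data$Date)
my_data$Weekday <- wday(my_data$Date, label = TRUE, week_start = 1)
my_data$Year    <- year(my_data$Date)

# Week number (as defined by lubridate::week) in which each month starts,
# taken from an arbitrary non-leap reference year
month_starts <- as.Date(sprintf("2014-%02d-01", 1:12))
week_breaks  <- week(month_starts)

p <- ggplot(my_data, aes(x = Week, y = Weekday, fill = Value)) +
  geom_tile() +
  theme_bw() +
  facet_grid(Year ~ .) +
  coord_fixed() +
  # One tick per month, labelled with the abbreviated month name
  scale_x_continuous(name = NULL, breaks = week_breaks, labels = month.abb)
p
```

Each label then sits on the column that contains the first day of that month in the reference year, which keeps the facets aligned by week while avoiding the drift of a fixed 4.2-week step.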

Expand axis dates to a full month in each facet

I am plotting router statistics (collected from merlin speed monitoring tool).
The graphs are faceted by year-month, and I want each month's x axis to expand to the entire month, even when I only have part of a month's data.
In the example below, the data for January 2022 is incomplete (just 6 hours or so of data).
The code I have tried:
library(tidyverse)
library(scales)
X.df <- read.csv(url("https://pastebin.com/raw/sGAzEDe6")) %>%
mutate(date = as.POSIXct(date, origin="1970-01-01"))
ggplot(X.df , aes(date, Download, colour = Download)) +
geom_line()+
facet_wrap(~ month, scale="free_x", ncol = 1) +
scale_colour_gradient(low="red",high="green", limits=c(0.0, 50), oob = squish) +
scale_x_datetime(date_labels = "%d/%m", breaks = "7 day", minor_breaks = "1 day") +
coord_cartesian(ylim = c(0, 60))
Again, I want the range of the x axis in each facet to cover the entire month. Thus, I want the X axis for the 2021-12 facet to run from 1st Dec 2021 to 31st Dec 2021, and the X axis for the 2022-01 facet to run from 1st Jan 2022 to 31st Jan 2022.
Is there some way of forcing this within ggplot2?
An additional, smaller self-contained example to try your code on:
X.df <- tribble(
~date, ~month, ~Download,
"2021-12-01T00:30:36Z","2021-12",20.13,
"2021-12-07T06:30:31Z","2021-12",38.95,
"2021-12-14T08:00:31Z","2021-12",38.44,
"2021-12-21T09:30:29Z","2021-12",28.57,
"2021-12-28T16:00:31Z","2021-12",30.78,
"2021-12-31T13:00:28Z","2021-12",55.45,
"2022-01-01T00:00:28Z","2022-1",55.44,
"2022-01-01T02:30:29Z","2022-1",55.63,
"2022-01-01T03:00:29Z","2022-1",55.75,
"2022-01-01T05:00:29Z","2022-1",55.8,
"2022-01-07T03:00:29Z","2022-1",53.6,
"2022-01-07T05:00:29Z","2022-1",51.8
)
As always, thanks in advance. Pete
Update II: removed prior versions.
In your data there is only one January 2022 date, so we complete the January 2022 dates in the data frame using complete() from the tidyr package.
library(tidyverse)
library(lubridate)
library(scales) # for squish()
X.df %>%
mutate(date = ymd(date)) %>%
group_by(month(date)) %>%
complete(date = seq(min(date), max(ceiling_date(date, unit = "month") - ddays(1)), by = 'day')) %>%
fill(month) %>%
ggplot(aes(x = date, Download, colour = Download)) +
geom_line()+
facet_wrap(~ month, scale="free_x", ncol = 1) +
scale_colour_gradient(low="red",high="green", limits=c(0.0, 50), oob = squish) +
scale_x_date(date_breaks = "1 day", date_labels = "%d/%m", expand = c(0, 0)) +
coord_cartesian(ylim = c(0, 60))
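Another possible route, sketched here under assumptions rather than taken from the answer above, is to leave the data untouched and add an invisible layer that contains the first and last instant of every month. With free x scales, each facet's range is then stretched to the full month. The helper data frame limits and the variables month_start and month_end are names introduced for illustration, and the timestamps are assumed to be in UTC.

```r
library(ggplot2)
library(lubridate)

# A few rows in the spirit of the question's example (UTC assumed)
X.df <- data.frame(
  date = as.POSIXct(c("2021-12-07 06:30:31", "2021-12-21 09:30:29",
                      "2022-01-01 00:00:28", "2022-01-01 05:00:29"),
                    tz = "UTC"),
  Download = c(38.95, 28.57, 55.44, 55.80)
)
X.df$month <- format(X.df$date, "%Y-%m")

# First and last second of every month that occurs in the data
month_start <- unique(floor_date(X.df$date, "month"))
month_end   <- month_start + months(1) - seconds(1)

# Two rows per facet: the month's lower and upper boundary
limits <- data.frame(
  month = rep(format(month_start, "%Y-%m"), times = 2),
  date  = c(month_start, month_end)
)

p <- ggplot(X.df, aes(date, Download)) +
  geom_line() +
  # Draws nothing, but its x values are used when training each facet's scale
  geom_blank(data = limits, aes(x = date), inherit.aes = FALSE) +
  facet_wrap(~ month, scales = "free_x", ncol = 1) +
  scale_x_datetime(date_labels = "%d/%m", date_breaks = "7 days") +
  coord_cartesian(ylim = c(0, 60))
p
```

Unlike completing the data with extra rows, this leaves the plotted series unchanged, so no NA values are introduced into the line.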

How to create custom date labels using ggplot with scale_x_date

When I use date_labels = "%b %y" within scale_x_date then the tick labels are rather cluttered because the year appears with each month. (I specifically want to label every month.) I would rather have the year appear only at the start and end of the date range, and also at December and January.
My minimal representative example follows.
I was hoping to use a function that creates the tick labels using date_labels = "%b %y" for December, January, the first month and the last month, and then to use date_labels = "%b" for all other months.
As a first stab, I tried to reproduce my existing (cluttered) tick labels with a function (by switching to the commented line), but was not able to do so.
To be specific, for this example I would like tick labels to be
Aug 20, Sep, Oct, Nov, Dec 20, Jan 21, Feb, Mar, Apr, May 21
Thank you for any suggestions.
library(tidyverse) # tibble(), %>% and ggplot()
start_date <- as.Date('2020-08-08')
end_date <- as.Date('2021-05-10')
# tibble with test data
mytib <- tibble( dates = as.Date(as.character(seq(start_date, end_date, by = 4) ) ),
yval = as.numeric(strftime(dates, format = "%d")),
# following two lines just to give some color to the plot
month = strftime(dates, format = "%m") ) %>%
group_by( month )
gd <-
ggplot(mytib, aes( x=dates, y=yval, color = month ) ) +
geom_point() +
geom_line( aes( x=dates, y=yval, color=month, group=month ) ) +
theme(legend.position = c("none")) +
scale_x_date(date_breaks = "1 month",
date_minor_breaks = "1 week",
date_labels = "%b %y" ) +
#labels = function(x) as.Date(x,format = "%b %y" ) ) +
labs (x=NULL, y=NULL ) +
geom_blank()
gd
I am not sure if it is possible to provide such customised labels with scale_x_date. You can create them with dplyr and use scale_x_discrete.
library(dplyr)
library(ggplot2)
library(lubridate)
mytib %>%
arrange(dates) %>%
mutate(month = month(dates),
year = year(dates)) %>%
group_by(year, month) %>%
mutate(labels = ifelse(row_number() == 1, format(dates, '%b'), '')) %>%
group_by(year) %>%
mutate(labels = ifelse(!duplicated(month) & month %in% range(month),
format(dates, '%b %y'), labels)) %>%
ungroup %>%
mutate(dates = factor(dates)) -> data
ggplot(data, aes( x=dates, y=yval, color = month)) +
geom_point() +
geom_line( aes( x=dates, y=yval, color=month, group=month ) ) +
theme(legend.position = c("none")) +
scale_x_discrete(labels = data$labels) +
labs (x=NULL, y=NULL ) +
geom_blank()
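For what it is worth, the labels argument of scale_x_date also accepts a function that receives the break dates and returns the label strings, so a sketch along the following lines may keep the continuous date axis. The helper label_month_sparse_year is a hypothetical name introduced here; it uses month.abb so the month names do not depend on the locale, and it skips any NA breaks it is handed.

```r
library(ggplot2)

# Hypothetical helper: "Mon yy" for the first and last break and for
# December and January, month name only for every other break
label_month_sparse_year <- function(x) {
  ok  <- !is.na(x)
  mon <- as.integer(format(x, "%m"))
  lab <- month.abb[mon]
  # Positions of the first and last non-missing break
  first_last <- seq_along(x) %in% c(head(which(ok), 1), tail(which(ok), 1))
  show_year  <- ok & (first_last | mon %in% c(12, 1))
  ifelse(show_year, paste(lab, format(x, "%y")), lab)
}

# Test data modelled on the question
dates <- seq(as.Date("2020-08-08"), as.Date("2021-05-10"), by = 4)
df <- data.frame(dates = dates, yval = as.numeric(format(dates, "%d")))

p <- ggplot(df, aes(x = dates, y = yval)) +
  geom_point() +
  scale_x_date(date_breaks = "1 month",
               date_minor_breaks = "1 week",
               labels = label_month_sparse_year) +
  labs(x = NULL, y = NULL)
p
```

Which breaks count as "first" and "last" depends on the breaks ggplot2 passes to the function, so the exact labels at the two ends of the axis can vary with the axis expansion.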

Why can't I get the right horizontal axis labels on my ggplot2 chart?

I am trying to do a faceted plot of a grouped dataframe with ggplot2, using geom_line(). My dataframe has a Date column and I would like to have dates on the horizontal axis. If I just use Date in aes(x=Date, ...) I get nice labels on the horizontal axis. However, the line has an almost horizontal section where the date jumps from the end of one group to the beginning of the next group. This code and chart show that:
library(dplyr)
library(ggplot2)
library(lubridate) # month()
dts <- seq.Date(as.Date("2020-01-01"), as.Date("2021-12-31"), by="day")
mos <- sapply(dts, month)
df <- data.frame(Date=dts, Month=mos)
nr <- nrow(df)
df$X <- rep(1, nr)
df %>%
group_by(Month) -> dfgrp
dfgrp %>%
group_by(Month) %>%
mutate(Time = Date[1:n()],
Z = cumsum(X)) %>%
ggplot(aes(x=Date, y=Z)) +
geom_line(color="darkgreen", size=0.5) +
facet_grid(. ~ Month, scale="free_x") +
theme(axis.text.x = element_text(angle=45, size=7))
I would not like my chart to have those almost-horizontal lines when the date changes by a large amount. I was able to generate a chart without those lines using integers on aes() as follows:
dfgrp %>%
mutate(Time = 1:n() %>% as.integer(),
Z = cumsum(X)) %>%
ggplot(aes(x=Time, y=Z)) +
geom_line(color="darkgreen", size=0.5) +
facet_grid(. ~ Month, scale="free_x") +
scale_x_continuous(breaks = seq(from=1, to=nr, by=10) %>% as.integer(),
labels = function(x) as.character(dfgrp$Date[x])) +
theme(axis.text.x = element_text(angle=45, size=7))
The line on the chart looks like I want it but the dates on the horizontal axis are not correct: they end in February 2020 in every facet while the dates in the dataframe end in December 2021 and the dates in the first chart begin and end on different months in different facets.
I tried many things but nothing worked. Any suggestions on how to have a chart with dates like in the first chart above and lines like in the second chart above?
Help will be much appreciated.
You may want to adjust the dates to be in the same year, while keeping the original year as a variable:
library(lubridate)
dfgrp %>%
group_by(Month) %>%
mutate(year = year(Date),
adj_date = ymd(paste(2020, month(Date), day(Date)))) %>%
# 2020 was leap year so 2/29 won't be lost
mutate(Time = Date[1:n()],
Z = cumsum(X)) %>%
ggplot(aes(x=adj_date, y=Z, color = year, group = year)) +
geom_line(size=0.5) +
facet_grid(. ~ Month, scale="free_x") +
theme(axis.text.x = element_text(angle=45, size=7))
