I am plotting router statistics (collected from merlin speed monitoring tool).
The graphs are faceted by year-month, and I want each month's x axis to span the entire month, even when I only have part of a month's data.
In the example below, the data for January 2022 is incomplete (just 6 hours or so of data).
The code I have tried:
library(tidyverse)
library(scales)
X.df <- read.csv(url("https://pastebin.com/raw/sGAzEDe6")) %>%
mutate(date = as.POSIXct(date, origin="1970-01-01"))
ggplot(X.df , aes(date, Download, colour = Download)) +
geom_line()+
facet_wrap(~ month, scale="free_x", ncol = 1) +
scale_colour_gradient(low="red",high="green", limits=c(0.0, 50), oob = squish) +
scale_x_datetime(date_labels = "%d/%m", breaks = "7 day", minor_breaks = "1 day") +
coord_cartesian(ylim = c(0, 60))
Again, I want the range of the x axis in each facet to cover the entire month. Thus, I want the X axis for the 2021-12 facet to run from 1st Dec 2021 to 31st Dec 2021, and the X axis for the 2022-01 facet to run from 1st Jan 2022 to 31st Jan 2022.
Is there some way of forcing this within ggplot2?
An additional, smaller self-contained example to try your code on:
X.df <- tribble(
~date, ~month, ~Download,
"2021-12-01T00:30:36Z","2021-12",20.13,
"2021-12-07T06:30:31Z","2021-12",38.95,
"2021-12-14T08:00:31Z","2021-12",38.44,
"2021-12-21T09:30:29Z","2021-12",28.57,
"2021-12-28T16:00:31Z","2021-12",30.78,
"2021-12-31T13:00:28Z","2021-12",55.45,
"2022-01-01T00:00:28Z","2022-1",55.44,
"2022-01-01T02:30:29Z","2022-1",55.63,
"2022-01-01T03:00:29Z","2022-1",55.75,
"2022-01-01T05:00:29Z","2022-1",55.8,
"2022-01-07T03:00:29Z","2022-1",53.6,
"2022-01-07T05:00:29Z","2022-1",51.8
)
As always, thanks in advance. Pete
Update II: removed prior versions.
Your data contains only one date in January 2022.
Here we fill in the missing January 2022 dates in the data frame using complete() from the tidyr package.
library(tidyverse)
library(lubridate)
library(scales)  # provides squish(), used for oob below
X.df %>%
mutate(date = ymd(date)) %>%
group_by(month(date)) %>%
complete(date = seq(min(date), max(ceiling_date(date, unit = "month") - ddays(1)), by = 'day')) %>%
fill(month) %>%
ggplot(aes(x = date, Download, colour = Download)) +
geom_line()+
facet_wrap(~ month, scale="free_x", ncol = 1) +
scale_colour_gradient(low="red",high="green", limits=c(0.0, 50), oob = squish) +
scale_x_date(date_breaks = "1 day", date_labels = "%d/%m", expand = c(0, 0)) +
coord_cartesian(ylim = c(0, 60))
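A different route to the same result, sketched here rather than taken from the answer above, is to leave the data unpadded and add an invisible layer that holds the first instant of each month and of the following month. With free x scales, each facet's axis then stretches to those points. The table name month_edges and the cut-down sample data are assumptions made for illustration; note that the axis ends at midnight on the 1st of the next month rather than on the 31st.

```r
library(dplyr)
library(ggplot2)
library(lubridate)

# Cut-down copy of the question's sample data, parsed as date-times (UTC)
X.df <- tibble::tribble(
  ~date,                  ~month,    ~Download,
  "2021-12-01T00:30:36Z", "2021-12", 20.13,
  "2021-12-31T13:00:28Z", "2021-12", 55.45,
  "2022-01-01T00:00:28Z", "2022-1",  55.44,
  "2022-01-07T05:00:29Z", "2022-1",  51.80
) %>%
  mutate(date = ymd_hms(date))

# month_edges (hypothetical helper table): two rows per facet, the first
# instant of the month and the first instant of the next month
month_start <- X.df %>%
  group_by(month) %>%
  summarise(date = floor_date(min(date), "month"), .groups = "drop")
month_edges <- bind_rows(month_start,
                         mutate(month_start, date = date + months(1)))

p <- ggplot(X.df, aes(date, Download)) +
  geom_line(aes(colour = Download)) +
  # geom_blank() draws nothing but still trains the x scale of each facet
  geom_blank(data = month_edges, aes(x = date), inherit.aes = FALSE) +
  facet_wrap(~ month, scales = "free_x", ncol = 1) +
  scale_x_datetime(date_labels = "%d/%m", date_breaks = "7 days")
```

Calling print(p) draws the plot. Any default axis expansion is still added on top of the month boundaries unless expand is set on the scale.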
Related
Similar to this question: Split up time series per year for plotting, which was done in Python, I want to display a daily time series as multiple lines, one per year. How can I achieve this in R?
library(ggplot2)
library(dplyr)
# Dummy data
df <- data.frame(
day = as.Date("2017-06-14") - 0:364,
value = runif(365) + seq(-140, 224)^2 / 10000
)
# Basic line plot
p <- ggplot(df, aes(x=day, y=value)) +
geom_line() +
xlab("")
p
One solution is using ggplot2, but date_labels are displayed incorrectly:
library(tidyverse)
library(lubridate)
p <- df %>%
# mutate(date = ymd(date)) %>%
mutate(date = as.Date(day)) %>%  # df stores its dates in the `day` column
mutate(
year = factor(year(date)), # use year to define separate curves
date = update(date, year = 1) # use a constant year for the x-axis
) %>%
ggplot(aes(date, value, color = year)) +
scale_x_date(date_breaks = "1 month", date_labels = "%b")
# Raw daily data
p + geom_line()
An alternative solution is to use gg_season from the feasts package:
library(feasts)
library(tsibble)
library(dplyr)
tsibbledata::aus_retail %>%
filter(
State == "Victoria",
Industry == "Cafes, restaurants and catering services"
) %>%
gg_season(Turnover)
References:
Split up time series per year for plotting
R - How to create a seasonal plot - Different lines for years
If you want your x axis to represent the months from January to December, then perhaps the simplest approach is to take the yday of each date and add it to the first of January of an arbitrary year:
library(tidyverse)
library(lubridate)
df <- data.frame(
day = as.Date("2017-06-14") - 0:364,
value = runif(365) + seq(-140, 224)^2 / 10000
)
df %>%
mutate(year = factor(year(day)),
       # subtract 1 so that yday 1 lands on 1 January itself
       date = as.Date('2017-01-01') + yday(day) - 1) %>%
ggplot(aes(date, value, color = year)) +
geom_line() +
scale_x_date(breaks = seq(as.Date('2017-01-01'), by = 'month', length.out = 12),
date_labels = '%b')
Created on 2023-02-07 with reprex v2.0.2
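To see what the day-of-year mapping does to individual dates, here is a small sketch with made-up dates. It subtracts 1 so that 1 January maps onto itself, and it includes one date from a leap year to show the one-day shift that leap years introduce after February.

```r
library(lubridate)

days <- as.Date(c("2016-06-15", "2017-01-01", "2017-06-14"))

# Shift every date onto the reference year 2017 by its day of the year.
# Subtracting 1 makes 1 January map onto 1 January rather than 2 January.
aligned <- as.Date("2017-01-01") + yday(days) - 1

data.frame(day = days, yday = yday(days), aligned = aligned)
```

The 2016 date lands one day later on the 2017 axis because 2016 has a 29 February; for a seasonal overview plot that shift is usually negligible.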
I tend to think simple is better:
transform(df, year = format(day, "%Y")) |>
ggplot(aes(x=day, y=value, group=year, color=year)) +
geom_line() +
xlab(NULL)
optionally removing the year legend with + guides(colour = "none").
I am trying to plot daily data with days of the week (Monday:Sunday) on the y-axis, week of the year on the x-axis with monthly labels (January:December), and facet by year with each facet as its own row. I want the week of the year to align between the facets. I also want each tile to be square.
Here is a toy dataset to work with:
my_data <- tibble(Date = seq(
as.Date("1/11/2013", "%d/%m/%Y"),
as.Date("31/12/2014", "%d/%m/%Y"),
"days"),
Value = runif(length(Date)))  # one random value per date; tibble() lets later columns refer to earlier ones
One solution I came up with is to use lubridate::week() to number the weeks and plot by week. This correctly aligns the x-axis between the facets. The problem is, I can't figure out how to label the x-axis with monthly labels.
my_data %>%
mutate(Week = week(Date)) %>%
mutate(Weekday = wday(Date, label = TRUE, week_start = 1)) %>%
mutate(Year = year(Date)) %>%
ggplot(aes(fill = Value, x = Week, y = Weekday)) +
geom_tile() +
theme_bw() +
facet_grid(Year ~ .) +
coord_fixed()
Alternatively, I tried plotting by the first day of the week using lubridate::floor_date and lubridate::round_date. In this solution, the x-axis is correctly labeled, but the weeks don't align between the two years. Also, the tiles aren't perfectly square, though I think this could be fixed by playing around with the coord_fixed ratio.
my_data %>%
mutate(Week = floor_date(Date, week_start = 1),
Week = round_date(Week, "week", week_start = 1)) %>%
mutate(Weekday = wday(Date, label = TRUE, week_start = 1)) %>%
mutate(Year = year(Date)) %>%
ggplot(aes(fill = Value, x = Week, y = Weekday)) +
geom_tile() +
theme_bw() +
facet_grid(Year ~ .) +
scale_x_datetime(name = NULL, labels = label_date("%b")) +
coord_fixed(7e5)
Any suggestions of how to get the columns to align correctly by week of the year while labeling the months correctly?
The concept is a little flawed, since the same week of the year is not guaranteed to fall in the same month every year. However, you can get a "close enough" solution by using the labels and breaks arguments of scale_x_continuous. The idea is to write a function that takes a week number, adds 7 times this value as a number of days to 1st January of an arbitrary year, then formats the result as a month name using strftime:
my_data %>%
mutate(Week = week(Date)) %>%
mutate(Weekday = wday(Date, label = TRUE, week_start = 1)) %>%
mutate(Year = year(Date)) %>%
ggplot(aes(fill = Value, x = Week, y = Weekday)) +
geom_tile() +
theme_bw() +
facet_grid(Year ~ .) +
coord_fixed() +
scale_x_continuous(labels = function(x) {
strftime(as.Date("2000-01-01") + 7 * x, "%B")
}, breaks = seq(1, 52, 4.2))
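Because the labels are only approximate, it can help to inspect what the labelling function returns for the chosen breaks before plotting. The sketch below pulls the same rule out into a named function; week_to_month is a hypothetical name introduced here, not part of ggplot2.

```r
# Same labelling rule as in the scale above, pulled out into a named function
# (week_to_month is a hypothetical name, not part of ggplot2)
week_to_month <- function(x) strftime(as.Date("2000-01-01") + 7 * x, "%B")

breaks <- seq(1, 52, 4.2)

# One row per axis break, with the month name that would be printed there
data.frame(week = breaks, label = week_to_month(breaks))
```

Printing the table makes it easy to spot a month that is repeated or skipped, in which case the break spacing can be adjusted.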
Another option, if you're sick of reinventing the wheel, is to use the calendarHeat function in the GitHub-only makeR package:
remotes::install_github("jbryer/makeR")  # install_github() comes from remotes (devtools re-exports it)
library(makeR)
calendarHeat(my_data$Date, my_data$Value)
How can I make one graph from dates in different years? For example, I have data from 2019 and 2020 and would like to display the results in one chart with only the months on the x axis. How can I limit the data to a given time period? I want one line for 2019 and a second line for 2020.
Date        Microsoft Teams
2019-01-06  3
2019-03-10  10
2019-06-09  15
2019-12-29  10
2020-01-06  25
2020-03-10  35
2020-06-09  43
2020-12-29  39
On this graph I want to make another line for year 2020. For this I use this command:
ggplot() +
  geom_line(data = trendy, aes(x = date, y = `Microsoft Teams`), color = "blue") +
  labs(title = "Popularność wyszukiwania hasła Microsoft Teams", x = "Data", y = "Popularność", caption = "") +
  scale_x_date(date_labels = "%B", limits = c(as.Date("2019-01-01"), as.Date("2019-12-31")))
Can someone help me if it's possible?
I am not sure which one you prefer, but here are two options for you.
manipulate data
trendy <- data %>%
mutate(Date = as.Date(Date),
year = year(Date),
date = paste('2000', month(Date), day(Date), sep = '-'),
date = as.Date(date))
plot 1
ggplot(data=trendy, aes(x=Date, y=`Microsoft Teams`, color = year)) +
geom_line() +
labs(title="Popularność wyszukiwania hasła Microsoft Teams", x="Data", y="Popularność", caption = "") +
scale_x_date(date_labels = "%B") +
theme_bw()
plot 2
ggplot(data=trendy, aes(x=date, y=`Microsoft Teams`, color = factor(year))) +
geom_line() +
labs(title="Popularność wyszukiwania hasła Microsoft Teams", x="Data", y="Popularność", caption = "") +
scale_x_date(date_labels = "%B") +
theme_bw()
library(tidyverse)
library(lubridate)
Preparing the data:
dat <- tribble(~Date, ~Teams,
"2019-01-06", 3,
"2019-03-10", 10,
"2019-06-09", 15,
"2019-12-29", 10,
"2020-01-06", 25,
"2020-03-10", 35,
"2020-06-09", 43,
"2020-12-29", 39)
dat <- mutate(dat, Date = parse_date(Date))
The trick is to separate the dates into years and months, and then map years as the colour dimension in the chart:
dat %>%
mutate(years = as.character(year(Date)), months = month(Date, label = TRUE)) %>%
ggplot(aes(x = months, y = Teams, colour = years, group = years)) +
geom_line()
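The reason the months line up in calendar order on the discrete axis, rather than alphabetically, is that month(label = TRUE) returns an ordered factor. A quick check with a few of the dates from the question:

```r
library(lubridate)

d <- as.Date(c("2019-12-29", "2019-01-06", "2020-06-09"))
m <- month(d, label = TRUE)

is.ordered(m)   # the labels come back as an ordered factor
as.integer(m)   # underlying month numbers
sort(m)         # sorts in calendar order, not alphabetically
```

The month names themselves depend on the session locale, but the ordering does not.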
Use ymd from the lubridate package to parse the date, then extract the year and month with year() and month().
Make both factors with as.factor.
Then plot with ggplot.
library(tidyverse)
library(lubridate)
df1 <- df %>%
mutate(year = as.factor(year(ymd(Date))),
month = as.factor(month(Date))
)
ggplot(df1, aes(x = month, y = Microsoft.Teams, colour = year, group=year)) +
geom_point()+
geom_line()
When I use date_labels = "%b %y" within scale_x_date, the tick labels are rather cluttered because the year appears with each month. (I specifically want to label every month.) I would rather have the year appear only at the start and end of the date range, and also at December and January.
My minimal representative example follows.
I was hoping to use a function that creates the tick labels using date_labels = "%b %y" for December, January, the first month and the last month, and date_labels = "%b" for all other months.
As a first stab, I tried to reproduce my existing (cluttered) tick labels with a function (by switching to the commented line), but was not able to do so.
To be specific, for this example I would like tick labels to be
Aug 20, Sep, Oct, Nov, Dec 20, Jan 21, Feb, Mar, Apr, May 21
Thank you for any suggestions.
start_date <- as.Date('2020-08-08')
end_date <- as.Date('2021-05-10')
# tibble with test data
mytib <- tibble( dates = as.Date(as.character(seq(start_date, end_date, by = 4) ) ),
yval = as.numeric(strftime(dates, format = "%d")),
# following two lines just to give some color to the plot
month = strftime(dates, format = "%m") ) %>%
group_by( month )
gd <-
ggplot(mytib, aes( x=dates, y=yval, color = month ) ) +
geom_point() +
geom_line( aes( x=dates, y=yval, color=month, group=month ) ) +
theme(legend.position = c("none")) +
scale_x_date(date_breaks = "1 month",
date_minor_breaks = "1 week",
date_labels = "%b %y" ) +
#labels = function(x) as.Date(x,format = "%b %y" ) ) +
labs (x=NULL, y=NULL ) +
geom_blank()
gd
I am not sure if it is possible to provide such customised labels with scale_x_date. You can create them with dplyr and use scale_x_discrete.
library(dplyr)
library(ggplot2)
library(lubridate)
mytib %>%
arrange(dates) %>%
mutate(month = month(dates),
year = year(dates)) %>%
group_by(year, month) %>%
mutate(labels = ifelse(row_number() == 1, format(dates, '%b'), '')) %>%
group_by(year) %>%
mutate(labels = ifelse(!duplicated(month) & month %in% range(month),
format(dates, '%b %y'), labels)) %>%
ungroup %>%
mutate(dates = factor(dates)) -> data
ggplot(data, aes( x=dates, y=yval, color = month)) +
geom_point() +
geom_line( aes( x=dates, y=yval, color=month, group=month ) ) +
theme(legend.position = c("none")) +
scale_x_discrete(labels = data$labels) +
labs (x=NULL, y=NULL ) +
geom_blank()
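For comparison, here is a sketch of the function-based labelling the question was hoping for, kept on a continuous date axis. The labels argument of scale_x_date accepts a function that receives the vector of breaks; label_month_year is a hypothetical helper written for this example, and it treats the first and last non-missing break as the ends of the range.

```r
# label_month_year is a hypothetical helper, not part of ggplot2 or scales.
# It shows "%b %y" for December, January and the first/last break,
# and "%b" for every other break.
label_month_year <- function(x) {
  ok <- which(!is.na(x))
  if (length(ok) == 0) return(rep(NA_character_, length(x)))
  m <- as.integer(format(x, "%m"))
  is_edge <- seq_along(x) %in% range(ok)
  show_year <- !is.na(m) & (m %in% c(1, 12) | is_edge)
  ifelse(show_year, format(x, "%b %y"), format(x, "%b"))
}

# Monthly breaks covering the question's date range
breaks <- seq(as.Date("2020-08-01"), as.Date("2021-05-01"), by = "month")
label_month_year(breaks)
```

It would be used as scale_x_date(date_breaks = "1 month", labels = label_month_year). One caveat: ggplot2 may generate breaks slightly outside the data range, so the year-bearing edge labels can fall on a break that is not drawn; passing explicit breaks, as above, avoids that.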
Building on this question and the use of "water year" in R, I have a question regarding plotting in ggplot2 with a common date axis over many years. A water year is defined as starting on October 1 and ending on September 30. It is a little more sensible for the hydrological cycle.
So say I have this data set:
library(dplyr)
library(ggplot2)
library(lubridate)
df <- data.frame(Date=seq.Date(as.Date("1910/1/1"), as.Date("1915/1/1"), "days"),
y=rnorm(1827,100,1))
Then here is the wtr_yr function:
wtr_yr <- function(dates, start_month=10) {
# Convert dates into POSIXlt
dates.posix = as.POSIXlt(dates)
# Year offset
offset = ifelse(dates.posix$mon >= start_month - 1, 1, 0)
# Water year
adj.year = dates.posix$year + 1900 + offset
# Return the water year
adj.year
}
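A quick check of the function at the boundaries shows how calendar dates map to water years. The function is repeated so the snippet runs on its own; the boundary dates are chosen here for illustration.

```r
# Copy of the function above so that this snippet is self-contained
wtr_yr <- function(dates, start_month = 10) {
  dates.posix <- as.POSIXlt(dates)
  # POSIXlt months are 0-based, so October is 9
  offset <- ifelse(dates.posix$mon >= start_month - 1, 1, 0)
  dates.posix$year + 1900 + offset
}

boundary <- as.Date(c("1910-09-30", "1910-10-01", "1911-09-30", "1911-10-01"))
data.frame(Date = boundary, water_year = wtr_yr(boundary))
```

Dates from October onwards are assigned to the following calendar year, so water year 1911 runs from 1910-10-01 to 1911-09-30.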
What I would like to do is use colour as a grouping variable, then make an x axis that consists only of month and day information. Usually I do it like so (using the lubridate package):
ymd(paste0("1900","-",month(df$Date),"-",day(df$Date)))
This works fine if the year is arranged normally. However, in this water year scenario, each water year spans two calendar years. So ideally I'd like a plot that runs from October 1 to September 30 and draws a separate line for each water year, keeping the correct water years. Here is where I am so far:
df1 <- df %>%
mutate(wtr_yrVAR=factor(wtr_yr(Date))) %>%
mutate(CDate=as.Date(paste0("1900","-",month(Date),"-",day(Date))))
df1 %>%
ggplot(aes(x=CDate, y=y, colour=wtr_yrVAR)) +
geom_point()
Plotting that, the date axis obviously spans from Jan to Dec. Any ideas how I can force ggplot2 to plot these along the water year lines?
Here is a method that works:
df3 <- df %>%
mutate(wtr_yrVAR=factor(wtr_yr(Date))) %>%
#seq along dates starting with the beginning of your water year
mutate(CDate=as.Date(paste0(ifelse(month(Date) < 10, "1901", "1900"),
"-", month(Date), "-", day(Date))))
Then:
df3 %>%
ggplot(., aes(x = CDate, y = y, colour = wtr_yrVAR)) +
geom_point() +
scale_x_date(date_labels = "%b %d")
Which gives:
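One detail worth checking with this approach is the leap day. Neither 1900 nor 1901 is a leap year, so a 29 February in the data (1912 in the example data) has no counterpart on the common axis and becomes NA. The sketch below compares that with an assumed alternative pair of placeholder years, 1999/2000, where 2000 is a leap year; the three dates are picked for illustration.

```r
library(lubridate)

dates <- as.Date(c("1911-10-01", "1912-02-29", "1912-09-30"))

# Placeholder years 1900/1901 as in the answer: neither is a leap year
cd_1900 <- as.Date(paste0(ifelse(month(dates) < 10, "1901", "1900"),
                          "-", month(dates), "-", day(dates)),
                   format = "%Y-%m-%d")

# Assumed alternative: 1999/2000, where 2000 is a leap year
cd_2000 <- as.Date(paste0(ifelse(month(dates) < 10, "2000", "1999"),
                          "-", month(dates), "-", day(dates)),
                   format = "%Y-%m-%d")

data.frame(Date = dates, cd_1900, cd_2000)
```

Since the axis labels only show month and day, the choice of placeholder years does not affect how the plot reads.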
Not very elegant, but this should work:
df1 <- df %>%
mutate(wtr_yrVAR=factor(wtr_yr(Date))) %>%
mutate(CDdate= as.Date(as.numeric(Date - as.Date(paste0(wtr_yrVAR,"-10-01"))), origin = "1900-10-01"))
df1 %>%
  ggplot(aes(x = CDdate, y = y, colour = wtr_yrVAR)) +
  geom_line() +
  scale_x_date(date_breaks = "1 month", date_labels = "%b",
               limits = c(as.Date("1899-09-30"), as.Date("1900-10-01"))) +
  theme_bw()