How to format difftime as hh:mm in ggplot2? - r

I want to display difftime data with ggplot2 and I want the tick format to be hh:mm.
library(ggplot2)
a= as.difftime(c("0:01", "4:00"), "%H:%M", unit="mins")
b= as.difftime(c('0:01', "2:47"), "%H:%M", unit="mins")
ggplot(data=NULL, aes(x=b, y=a)) + geom_point(shape=1) +
scale_x_time(labels = date_format("%H:%M"),
breaks = "1 hour")
But I get the following warning:
Don't know how to automatically pick scale for object of type difftime. Defaulting to continuous.
Warning message:
In structure(as.numeric(x), names = names(x)) : NAs introduced by coercion
and this as a graph:
Update:
My example was too minimal: I also need to be able to display negative differences, so this would be better data:
a= as.difftime(c(-60, -4*60), unit="mins")
b= as.difftime(c(-60, 2*60+47), unit="mins")
ggplot(data=NULL, aes(x=b, y=a)) + geom_point(shape=1)

The answer has two parts.
Plotting difftime objects
According to help("scale_x_time"), ggplot2 supports three date/time classes: scale_*_date for dates (class Date), scale_*_datetime for datetimes (class POSIXct), and scale_*_time for times (class hms). The last one is what we need here.
Class hms is a custom class for difftime vectors, and as.hms() has a method for difftime (recent versions of the hms package prefer the equivalent as_hms()). So, difftime objects can be plotted with ggplot2 by coercing them to class hms:
a <- as.difftime(c(-60, -4 * 60), unit = "mins")
b <- as.difftime(c(-60, 2 * 60 + 47), unit = "mins")
library(ggplot2)
ggplot(data = NULL, aes(x = hms::as.hms(b), y = hms::as.hms(a))) +
geom_point(shape = 1)
Note that negative time differences are shown as well.
Formatting the tick labels
The OP has requested that tick marks should be labeled in hh:mm format. Apparently, the default formatting is hh:mm:ss. This can be modified by specifying a function that takes the breaks as input and returns labels as output to the labels parameter of the scale_x_time() and scale_y_time() functions:
format_hm <- function(sec) stringr::str_sub(format(sec), end = -4L)
ggplot(data = NULL, aes(x = hms::as.hms(b), y = hms::as.hms(a))) +
geom_point(shape = 1) +
scale_x_time(name = "b", labels = format_hm) +
scale_y_time(name = "a", labels = format_hm)
The format_hm() function truncates the :ss part from the default format. In addition, the axes are labeled nicely.
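As a variation on the truncation approach above, the labels can also be built arithmetically in base R. This is a minimal sketch, not the answer's own method: format_hm_base is a hypothetical helper name, and it assumes the breaks handed to the labels function can be converted to seconds with as.numeric(), which is the case for hms objects.

```r
# Hypothetical helper: format durations given in seconds as [-]hh:mm.
# Assumption: as.numeric(sec) yields seconds (true for hms objects).
format_hm_base <- function(sec) {
  sec <- as.numeric(sec)
  sign_chr <- ifelse(!is.na(sec) & sec < 0, "-", "")
  total_min <- floor(abs(sec) / 60)  # whole minutes, seconds dropped
  out <- sprintf("%s%02d:%02d", sign_chr,
                 as.integer(total_min %/% 60),
                 as.integer(total_min %% 60))
  out[is.na(sec)] <- NA_character_   # keep missing breaks missing
  out
}

format_hm_base(c(-3600, 60, 10020))
```

It could be passed as labels = format_hm_base to scale_x_time() and scale_y_time() in place of format_hm.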

Depending on your constraints, you might consider translating the difftimes to distinct datetimes, which ggplot can handle just fine:
library(lubridate)
a_date_times <- floor_date(Sys.time(), "1 day") + a
b_date_times <- floor_date(Sys.time(), "1 day") + b
ggplot(data=NULL, aes(x=a_date_times, y=b_date_times)) +
geom_point(shape=1)

My best approach so far is:
library(ggplot2)
library(lubridate)
a= as.difftime(c(-60, -4*60), unit="mins")
b= as.difftime(c(-60, 2*60+47), unit="mins")
# Define the label function before the scales that use it
f <- function(x){
t = seconds_to_period(abs(x)*60)
r = sprintf("% 2i:%02i", sign(x)*hour(t), minute(t))
return(r)
}
xbreaks = seq(ceiling(min(b)/60), floor(max(b)/60)) * 60
ybreaks = seq(ceiling(min(a)/60), floor(max(a)/60)) * 60
ggplot(data=NULL, aes(x=b, y=a)) + geom_point(shape=1) +
scale_x_continuous(labels = f, breaks = xbreaks) +
scale_y_continuous(labels = f, breaks = ybreaks)
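The hourly breaks used in this approach can be computed on plain numbers, which may also avoid the "Don't know how to automatically pick scale" message if the numeric values are mapped instead of the difftime objects. A minimal sketch, where hourly_breaks is a hypothetical helper name and the data are assumed to be in minutes:

```r
# Hypothetical helper: whole-hour break positions (in minutes) that fall
# inside the range of a difftime vector.
hourly_breaks <- function(x) {
  m <- as.numeric(x, units = "mins")  # plain minutes, no difftime class
  seq(ceiling(min(m) / 60), floor(max(m) / 60)) * 60
}

b <- as.difftime(c(-60, 2 * 60 + 47), units = "mins")
hourly_breaks(b)
```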

Related

ggplot2 comparison of time period

I need to visualize and compare the difference between two equally long sales periods, 2018/2019 and 2019/2020. Both periods begin at week 44 and end at week 36 of the following year. If I plot them against the year-week, the two periods follow one another instead of overlapping. If I use only the week number, the values are sorted as one continuum from week 1 to week 52 and the graph does not make sense. Can you think of a solution?
Thank You
Data:
library(ggplot2)
library(tsibble) # provides yearweek()
set.seed(1)
df1 <- data.frame(sells = runif(44),
week = c(44:52,1:35),
YW = yearweek(seq(as.Date("2018-11-01"), as.Date("2019-08-31"), by = "1 week")),
period = "18/19")
df2 <- data.frame(sells = runif(44),
week = c(44:52,1:35),
YW = yearweek(seq(as.Date("2019-11-01"), as.Date("2020-08-31"), by = "1 week")),
period = "19/20")
# Yearweek on x axis, when both period are separated
ggplot(df1, aes(YW, sells)) +
geom_line(aes(color="Period 18/19")) +
geom_line(data=df2, aes(color="Period 19/20")) +
labs(color="Legend text")
# Week on x axis, where the weeks form one continuum and are not split by year
ggplot(df1, aes(week, sells)) +
geom_line(aes(color="Period 18/19")) +
geom_line(data=df2, aes(color="Period 19/20")) +
labs(color="Legend text")
Another alternative is to facet it. This'll require combining the two sets into one, preserving the data source. (This is commonly a better way of dealing with it in general, anyway.)
(I don't have tsibble, so my YW just has seq(...), no yearweek. It should translate.)
ggplot(dplyr::bind_rows(tibble::lst(df1, df2), .id = "id"), aes(YW, sells)) +
geom_line(aes(color = id)) +
facet_wrap(id ~ ., scales = "free_x", ncol = 1)
In place of dplyr::bind_rows, one might also use data.table::rbindlist(..., idcol="id"), or do.call(rbind, ...), though with the latter you will need to assign id externally.
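For the do.call(rbind, ...) variant mentioned above, the id column has to be attached by hand before binding. A minimal base-R sketch with small made-up data frames (the values are illustrative only):

```r
# Made-up stand-ins for the two periods
df1 <- data.frame(sells = 1:3, week = c(51, 52, 1))
df2 <- data.frame(sells = 4:6, week = c(51, 52, 1))

dfs <- list(df1 = df1, df2 = df2)

# Add the list name as an "id" column to each data frame, then stack them
combined <- do.call(
  rbind,
  Map(function(d, nm) cbind(id = nm, d), dfs, names(dfs))
)
rownames(combined) <- NULL
combined
```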
One more note: the default formatting of the x-axis is obscuring the "year" of the data. If this is relevant/important (and not apparent elsewhere), then use ggplot2's normal mechanism for forcing labels, e.g.,
... +
scale_x_date(labels = function(z) format(z, "%Y-%m"))
In the unlikely case that tibble::lst is not available, you can replace it with list(df1 = df1, df2 = df2) or similar.
If you want to keep the x axis as a numeric scale, you can do:
ggplot(df1, aes((week + 9) %% 52, sells)) +
geom_line(aes(color="Period 18/19")) +
geom_line(data=df2, aes(color="Period 19/20")) +
scale_x_continuous(breaks = 1:52,
labels = function(x) ifelse(x == 9, 52, (x - 9) %% 52),
name = "week") +
labs(color="Legend text")
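The shift used above can be checked in isolation. This sketch only verifies the arithmetic: (week + 9) %% 52 maps the period's weeks onto consecutive positions, and the labels function maps each position back to its week number.

```r
week <- c(44:52, 1:35)

# Position within the sales period: week 44 becomes 1, week 35 becomes 44
pos <- (week + 9) %% 52

# Inverse mapping used for the axis labels
back <- ifelse(pos == 9, 52, (pos - 9) %% 52)

all(pos == 1:44)   # positions are consecutive
all(back == week)  # labels recover the original week numbers
```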
Try this. You can convert your week variable to a factor and keep the desired order. Here is the code:
library(ggplot2)
library(tsibble)
#Data
df1$week <- factor(df1$week,levels = unique(df1$week),ordered = T)
df2$week <- factor(df2$week,levels = unique(df2$week),ordered = T)
#Plot
ggplot(df1, aes(week, sells)) +
geom_line(aes(color="Period 18/19",group=1)) +
geom_line(data=df2, aes(color="Period 19/20",group=1)) +
labs(color="Legend text")

Timestamp on x-axis in timeseries ggplot

I have measurement data from the past months:
Variables
x <- df$DatoTid
y <- df$Partikler
color <- df$Opgave
I'm trying to plot my data based on the timestamp, so that the x-axis shows the hours of the day instead of the specific POSIXct datetime.
I would like the labels and ticks of the x-axis to be, e.g., "00:00", "01:00", ..., "24:00", so that noon is in the middle of the x-axis.
So far I have tried to convert the datetime values into characters.
It doesn't look good yet: as you can see, the axis ticks and labels are gone, and possibly other things are wrong as well.
Can someone help me?
And please let me know how to upload the data for you. I don't know how to add a huge .csv-file....
# Rounding to the nearest 5 min (300 s):
head(df)
df$Tid2 <- format(strptime("1970-01-01", "%Y-%m-%d", tz="CET") +
round(as.numeric(df$DatoTid)/300)*300 + 3600, "%Y-%m-%d %H:%M:%S")
head(df)
df$Tid2 <- as.character(df$Tid2)
str(df)
x <- df$Tid2
y <- df$Partikler
color <- df$Opgave
plot2 <- ggplot(data = df, aes(x = x, y = y, color = color)) +
geom_point(shape=16, alpha=0.6, size=1.8) +
scale_y_continuous(labels=function(x) format(x, big.mark = ".", decimal.mark = ",", scientific = FALSE)) +
scale_x_discrete(breaks=c("00:00:00", "06:00:00", "09:00:00", "12:00:00", "18:00:00", "21:00:00")) +
scale_color_discrete(name = "Case") +
xlab(" ") +
ylab(expression(paste("Partikelkoncentration [pt/cc]"))) +
myTheme +
theme(legend.text=element_text(size=8), legend.title=element_text(size=8))
plot2
I would approach this by making a new time stamp that uses a single day, but the hours/minutes/seconds of your existing time stamp.
First, here's a made-up version of your data, here using a linear trend in Partikler:
library(tidyverse); library(lubridate)
df <- tibble(Tid2 = seq.POSIXt(from = ymd_h(2019010100), # tibble() replaces the deprecated data_frame()
to = ymd_h(2019011500), by = 60*60),
Partikler = seq(from = 0, to = 2.5E5, along.with = Tid2),
Opgave = as.factor(floor_date(Tid2, "3 days")))
# Here's a plot that's structurally similar to yours:
ggplot(df, aes(Tid2, Partikler, col = Opgave)) +
geom_point() +
scale_color_discrete(name = "Case")
Now, if we change the timestamps to be in the same day, we can control them like usual in ggplot, but with them collapsed into a single day of timing. We can also change the x axis so it doesn't mention the date component of the time stamp:
df2 <- df %>%
mutate(Tid2_sameday = ymd_hms(paste(Sys.Date(),
hour(Tid2), minute(Tid2), second(Tid2))))
ggplot(df2, aes(Tid2_sameday, Partikler, col = Opgave)) +
geom_point() +
scale_color_discrete(name = "Case") +
scale_x_datetime(date_labels = "%H:%M")
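The same-day idea can also be sketched in base R without lubridate. Here collapse_to_day is a hypothetical helper name, and the fixed reference day and the UTC time zone are assumptions made for the example:

```r
# Hypothetical helper: keep the time of day, replace the date with one
# fixed reference day so that all points share a single 24-hour axis.
collapse_to_day <- function(x, day = "1970-01-01", tz = "UTC") {
  as.POSIXct(paste(day, format(x, "%H:%M:%S", tz = tz)), tz = tz)
}

x <- as.POSIXct(c("2019-01-03 06:30:00", "2019-01-09 18:45:10"), tz = "UTC")
collapse_to_day(x)
```

The result is still POSIXct, so it can be used with scale_x_datetime(date_labels = "%H:%M") as in the plot above.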

Setting limits with scale_x_datetime and time data

I want to set bounds for the x-axis for a plot of time-series data which features only time (no dates). My limits are:
lims <- strptime(c("03:00","16:00"), format = "%H:%M")
And my ggplot prints fine, but when I add this to scale_x_datetime
scale_x_datetime(limits = lims)
I get Error: Invalid input: time_trans works with objects of class POSIXct only
Fully reproducible example courtesy of How to create a time scatterplot with R?
dates <- as.POSIXct(as.Date("2011/01/01") + sample(0:365, 100, replace=TRUE))
times <- as.POSIXct(runif(100, 0, 24*60*60), origin="2011/01/01")
df <- data.frame(
dates = dates,
times = times
)
lims <- strptime(c("04:00","16:00"), format = "%H:%M")
library(scales)
library(ggplot2)
ggplot(df, aes(x=dates, y=times)) +
geom_point() +
scale_y_datetime(limits = lims, breaks=date_breaks("4 hour"), labels=date_format("%H:%M")) +
theme(axis.text.x=element_text(angle=90))
The error message says that you should use as.POSIXct on lims.
You also need to add the date (year, month and day) to lims, because by default strptime fills in the current date (in 2015 at the time of writing), which is outside the range of the data.
lims <- as.POSIXct(strptime(c("2011-01-01 03:00","2011-01-01 16:00"), format = "%Y-%m-%d %H:%M"))
ggplot(df, aes(x=dates, y=times)) +
geom_point() +
scale_y_datetime(limits =lims, breaks=date_breaks("4 hour"), labels=date_format("%H:%M"))+
theme(axis.text.x=element_text(angle=90))
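The class difference behind the error can be seen without plotting. A small sketch, with the UTC time zone chosen only to keep the example reproducible:

```r
# strptime() returns POSIXlt and fills in the current date when only a
# time of day is given
lims_lt <- strptime(c("03:00", "16:00"), format = "%H:%M", tz = "UTC")
class(lims_lt)

# Spelling out the date and converting gives POSIXct limits on the same
# day as the data
lims <- as.POSIXct(c("2011-01-01 03:00", "2011-01-01 16:00"), tz = "UTC")
class(lims)
difftime(lims[2], lims[1], units = "hours")
```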

ggplot2: minor breaks in scale_x_datetime

This code:
library(lubridate)
library(ggplot2)
library(scales)
.months <- 3
.minor.intervals <- 4
.minor.intervals.num <- .months * .minor.intervals
sdate <- as.POSIXct("2015-01-01")
edate <- sdate + months(.months)
df <- data.frame(x = seq(from = sdate, to = edate,
length.out = .minor.intervals.num * 2),
y = 1:(.minor.intervals.num * 2))
p <- ggplot(df, aes(x = x, y = y))
xbm <- seq(from = sdate, to = edate, length.out = .minor.intervals.num)
p <- p + scale_x_datetime(limits = c(sdate, edate),
breaks = date_breaks("month"),
minor_breaks = xbm)
p <- p + geom_line() + geom_point()
plot(p)
gives me the error: Error in Ops.POSIXt((x - from[1]), diff(from)) : '/' not defined for "POSIXt" objects
If I comment out the minor_breaks part, everything works.
If I change the minor_breaks part to minor_breaks = date_breaks("week"), everything works too.
But I want to split each month into exactly 4 parts.
How can I fix it?
I have found a way to solve the problem, but I must admit that I am not sure why it has to be done this way. It seems that minor_breaks expects numeric values and not dates as input.
I created the breaks with the following code:
maj.breaks <- sdate + months(0:.months)
min.breaks <- do.call(c,
lapply(1:.months,function(m) {
seq(maj.breaks[m],maj.breaks[m+1],length.out = .minor.intervals+1)
})
)
which relies on the variables as defined in your example. Note the difference to your way of defining the minor breaks: since each month has different length, it is not enough to simply split the range between the start and end dates into the appropriate number of segments. You have to split each month by itself.
As mentioned above, you then have to convert min.breaks to numeric before you pass it to minor_breaks. I produce the plot as follows:
p <- ggplot(df, aes(x = x, y = y)) +
scale_x_datetime(limits = c(sdate, edate),
breaks = maj.breaks,
minor_breaks = as.numeric(min.breaks)) +
geom_line() + geom_point()
plot(p)
This is identical to your code up to the inputs for breaks and minor_breaks. There is no need to use the vector maj.breaks since your version works just as well. But I think it is interesting to note that breaks works with input of class POSIXct, while minor_breaks expects numeric values. Unfortunately, I don't know the reason for this.
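The per-month splitting can also be sketched in base R, without lubridate. This variant drops the duplicated month boundaries that arise when both ends of each month are included. The UTC time zone and the variable names maj and minor are choices made for this example:

```r
sdate <- as.POSIXct("2015-01-01", tz = "UTC")

# Major breaks: the first of each month, January through April
maj <- seq(sdate, by = "month", length.out = 4)

# Split each month into 4 equal parts; drop each segment's end point,
# since it is the start of the next month
minor <- do.call(c, lapply(seq_len(length(maj) - 1), function(i) {
  head(seq(maj[i], maj[i + 1], length.out = 5), -1)
}))
minor <- c(minor, maj[length(maj)])  # add the final boundary back

length(minor)
```

As in the answer above, as.numeric(minor) may be needed when passing the result to minor_breaks, depending on the ggplot2 version.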

Trouble with placing and formatting dates in ggplot2 graph using chron

I've been trying to add appropriate dates on the x-axis of my graph, but can't figure out how to do it in a sane way. What I want is pretty simple: a date at every January 1st in between the minimum and maximum of my data set.
I don't want to include the month - just '2008' or '2009' or whatever is fine. A great example would be this graph:
example graph
Except I want the date on every year, rather than every other year.
I can't seem to figure this out. My dates are defined as days since 1/1/1970, and I've included a method dateEPOCH_formatter which converts the epoch format to a format using the chron package. I've figured out how to make a tick mark and date at the origin of the graph and every 365 days thereafter, but that's not quite the same thing.
Another minor problem is that, mysteriously, the line chron(floor(y), out.format="mon year",origin.=epoch) outputs a graph with axis markers like 'Mar 2008', but changing the line to chron(floor(y), out.format="year",origin.=epoch) doesn't give me a result like '2008' - it just results in the error:
Error in parse.format(format[1]) : unrecognized format year
Calls: print ... as.character.times -> format -> format.dates -> parse.format
Execution halted
Here's my code - thanks for the help.
library(ggplot2)
library(chron)
argv <- commandArgs(trailingOnly = TRUE)
mydata = read.csv(argv[1])
png(argv[2], height=300, width=470)
timeHMS_formatter <- function(x) { # Takes time in seconds from midnight, converts to HH:MM:SS
h <- floor(x/3600) # whole hours
m <- floor((x %% 3600)/60) # whole minutes within the hour
s <- round(x %% 60) # Round to nearest second
lab <- sprintf('%02d:%02d:%02d', h, m, s) # Format the strings as HH:MM:SS
lab <- gsub('^00:', '', lab) # Remove leading 00: if present
lab <- gsub('^0', '', lab) # Remove leading 0 if present
lab
}
dateEPOCH_formatter <- function (y){
epoch <- c(month=1,day=1,year=1970)
chron(floor(y), out.format="mon year",origin.=epoch)
}
p= ggplot() +
coord_cartesian(xlim=c(min(mydata$day),max(mydata$day)), ylim=c(0,86400)) + # displays data from first email through present
scale_color_hue() +
xlab("Date") +
ylab("Time of Day") +
scale_y_continuous(label=timeHMS_formatter, breaks=seq(0, 86400, 14400)) + # adds tick marks every 4 hours
scale_x_continuous(label=dateEPOCH_formatter, breaks=seq(min(mydata$day), max(mydata$day), 365) ) +
ggtitle("Email Sending Times") + # adds graph title
theme( legend.position = "none", axis.title.x = element_text(vjust=-0.3)) +
theme_bw() +
layer(
data=mydata,
mapping=aes(x=mydata$day, y=mydata$seconds),
stat="identity",
stat_params=list(),
geom="point",
geom_params=list(alpha=5/8, size=2, color="#A9203E"),
position=position_identity(),
)
print(p)
dev.off()
I think it will be much easier to use the built-in function scale_x_date with date_format and date_breaks from the scales package. These should work with most date classes in R, such as Date, chron, etc.
For example:
library(ggplot2)
library(chron)
library(scales)
# some example data
days <- seq(as.Date('01-01-2000', format = '%d-%m-%Y'),
as.Date('01-01-2010', format = '%d-%m-%Y'), by = 1)
days_chron <- as.chron(days)
mydata <- data.frame(day = days_chron, y = rnorm(length(days)))
# the plot
ggplot(mydata, aes(x=days, y= y)) + geom_point() +
scale_x_date(breaks = date_breaks('year'), labels = date_format('%Y'))
To show how intuitive and easy these functions are, here is how to get month-year labels every 6 months. Note that this requires a very wide plot or very small axis labels:
ggplot(mydata, aes(x=days, y= y)) + geom_point() +
scale_x_date(breaks = date_breaks('6 months'), labels = date_format('%b-%Y'))
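If the data must stay as days since 1970-01-01, as in the question, the January 1st positions can be computed directly in base R. A minimal sketch with made-up day numbers (day_num is hypothetical data):

```r
# Hypothetical data: days since 1970-01-01
day_num <- c(13900, 14300, 15000)
d <- as.Date(day_num, origin = "1970-01-01")

# Every January 1st between the first and the last observation
years <- as.integer(format(min(d), "%Y")):as.integer(format(max(d), "%Y"))
jan1 <- as.Date(sprintf("%d-01-01", years))
jan1 <- jan1[jan1 >= min(d) & jan1 <= max(d)]

jan1              # break dates
as.numeric(jan1)  # the same breaks as days since the epoch
format(jan1, "%Y") # year-only labels
```

The numeric values could serve as breaks for scale_x_continuous(), with the year strings as labels.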

Resources