ggplot2: how to graph only part of a time series

I would like to use ggplot to graph portions of time series data. For example, say I only wanted to graph the last five dates of this data. Is there a way to specify this in ggplot without subsetting the data ahead of time? I tried using xlim, but it didn't work.
date <- c("2016-03-24","2016-03-25","2016-03-26","2016-03-27","2016-03-28",
"2016-03-29","2016-03-30","2016-03-31","2016-04-01","2016-04-02")
Temp <- c(35,34,92,42,21,47,37,42,63,12)
df <- data.frame(date,Temp)
My attempt:
ggplot(df) + geom_line(aes(x=date,y=Temp)) + xlim("2016-03-29","2016-04-02")
My dates are formatted as POSIXct.

You have to pass the xlim values as Date or POSIXct objects (matching the class of the x column), not as character strings. Is this what you want?
library(ggplot2)
df$date <- as.Date(df$date, format = "%Y-%m-%d")
ggplot(df) + geom_line(aes(x = date, y = Temp)) +
xlim(as.Date(c("2016-03-29", "2016-04-02"), format = "%Y-%m-%d"))
PS: Be aware that you will get the following warning, because xlim() discards the rows that fall outside the limits:
Warning message:
Removed 5 rows containing missing values (geom_path)
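
If the warning is a concern, two alternatives avoid it. The following is a minimal sketch rather than a definitive recipe; it assumes the same example data, and the names lims, p_zoom and p_tail are made up for illustration. coord_cartesian() zooms the view without discarding rows, while tail() subsets inside the ggplot() call and leaves df itself untouched.

```r
library(ggplot2)

# Same example data as in the question, with the dates stored as Date
df <- data.frame(
  date = as.Date(c("2016-03-24", "2016-03-25", "2016-03-26", "2016-03-27",
                   "2016-03-28", "2016-03-29", "2016-03-30", "2016-03-31",
                   "2016-04-01", "2016-04-02")),
  Temp = c(35, 34, 92, 42, 21, 47, 37, 42, 63, 12)
)

# Limits covering the last five observations
lims <- as.Date(c("2016-03-29", "2016-04-02"))

# Option 1: zoom the view; no rows are dropped, so no warning is raised
p_zoom <- ggplot(df, aes(x = date, y = Temp)) +
  geom_line() +
  coord_cartesian(xlim = lims)

# Option 2: subset inside the call; df itself is not modified
p_tail <- ggplot(tail(df, 5), aes(x = date, y = Temp)) +
  geom_line()

print(p_zoom)
print(p_tail)
```

Note that with coord_cartesian() the y axis is still scaled to the full data set, whereas the tail() version scales it to the five plotted rows only.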

Related

Order ggplot geom_point in order by date on X axis

I have a table of maximum trip lengths by month which I am trying to graph in R.
When I graph it, the x axis is not ordered by month; instead, it is ordered alphabetically.
I'm just getting started in R and I used the following code from one of the videos I watched, adjusted for my table names:
max_trips <- read.csv("max_and_min_trips.csv")
ggplot(data=max_trips)+
geom_point(mapping = aes(x=month,y=max_trip_duration))+
scale_x_month(month_labels = "%Y-%m")
The simple answer is that the data for your "month" column is stored as a vector of strings, not as a date. In R, this data type is called a "character" (or chr). You can confirm this by typing class(max_trips$month). The result will most likely be "character" in your console (or "factor" in R versions before 4.0, where read.csv() converted strings to factors by default). Therefore, your solution would be to (1) convert the data type to a date and (2) adjust the formatting of the date on the x axis using scale_x_date and/or related functions.
I'll demonstrate the process with a simple example dataset and plot. Here's the basic data frame and plot. You'll see, the plot is again arranged "alphabetically" instead of as expected if the mydf$dates values were stored as dates in "month/year" format.
library(ggplot2)
library(lubridate)
mydf <- data.frame(
dates = c("1/21", "2/20", "12/21", "3/19", "10/19", "9/19"),
yvals = c(13, 31, 14, 10, 20, 18))
ggplot(mydf, aes(x = dates, y = yvals)) + geom_point()
Convert to Date
To convert to a date, you can use a few different functions, but I find the lubridate package particularly useful here. The as_date() function will be used for the conversion; however, we cannot just apply as_date() directly to mydf$dates, or every value comes back as NA with the following warning in the console:
> as_date(mydf$dates)
[1] NA NA NA NA NA NA
Warning message:
All formats failed to parse. No formats found.
Since there are so many ways to format data that corresponds to dates, date-times, etc., we need to specify that our data is in "month/year" format. The other key here is that a Date must specify year, month and day. Our data only specifies month and year, so we first need to add an arbitrary "day" to each value before converting. Here's something that works:
mydf$dates <- as_date(
paste0("1/", mydf$dates), # need to add a "day" to correctly format for date
format = "%d/%m/%y" # nomenclature from strptime()
)
The paste0(...) function adds "1/" before each value in mydf$dates, and the format = argument specifies that the character values should be read as "day/month/year". For more information on the format codes for dates, see the help page for strptime() (?strptime).
Now our column is in date format:
> mydf$dates
[1] "2021-01-01" "2020-02-01" "2021-12-01" "2019-03-01" "2019-10-01" "2019-09-01"
> class(mydf$dates)
[1] "Date"
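
The same conversion can also be done without lubridate. The following is a minimal base R sketch under the same assumptions ("month/year" strings with two-digit years); the name dates_chr is made up for illustration.

```r
# "month/year" strings, as in the example data above
dates_chr <- c("1/21", "2/20", "12/21", "3/19", "10/19", "9/19")

# Prepend an arbitrary day so each string describes a complete date,
# then parse it as day/month/two-digit-year
dates <- as.Date(paste0("1/", dates_chr), format = "%d/%m/%y")

class(dates)      # "Date"
sort(dates_chr)   # character sort: alphabetical, not chronological
sort(dates)       # Date sort: chronological
```

Sorting the character vector and the Date vector side by side shows why the original plot came out in the wrong order.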
Changing Date Scale
When plotting now, the data is organized in the proper order along a date scale x axis.
p <- ggplot(mydf, aes(x = dates, y = yvals)) + geom_point()
p
If the labeling isn't quite what you are looking for, check the documentation for scale_x_date() (?scale_x_date) for some suggestions. The basic idea is to set where the breaks fall with date_breaks= and how they are labeled with date_labels=.
p + scale_x_date(date_breaks = "4 months", date_labels = "%m/%y")
In the OP's case, I would suggest something like the following, assuming the month column holds "month/year" strings like the example above:
library(ggplot2)
library(lubridate)
max_trips <- read.csv("max_and_min_trips.csv")
max_trips$month <- as_date(
paste0("1/", max_trips$month),
format = "%d/%m/%y")
ggplot(data=max_trips)+
geom_point(mapping = aes(x=month,y=max_trip_duration))+
scale_x_date(date_breaks = "1 month", date_labels = "%Y-%m")

Parsing Time Data

I'm trying to bin time data into 1 min intervals throughout the day. Based on the question "How to bin times from different days into time bins", I've managed to generate the breaks using the following:
breaks <- format(seq.POSIXt(from = strptime("00:00:00", format = "%H:%M:%S"), to = strptime("23:59:59", format = "%H:%M:%S"), by="1 min"), format = "%H:%M:%S")
However, once I try to use cut, NAs are introduced by coercion instead:
data$bins <- cut(data$TIME, breaks = breaks)
Warning message:
In sort.int(as.double(breaks)) : NAs introduced by coercion
I've used the chron package to convert my data (hh:mm:ss format) into a time object and other operations seem to function properly.
Thanks!
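
A likely cause is that format() turns the breaks into character strings such as "00:01:00", which cut() then tries to coerce to numbers. The following is a minimal sketch of one way around this, not a definitive fix: it assumes the times are available as "hh:mm:ss" strings, and the sample values, the anchor date 1970-01-01 and the variable names are all made up for illustration. Both the data and the breaks are kept as POSIXct, and the formatting is applied only to the bin labels.

```r
# Made-up sample times in "hh:mm:ss" format
time_strings <- c("00:00:30", "00:01:15", "13:45:59", "23:59:10")

# Attach an arbitrary fixed day so every value is a complete date-time
times <- as.POSIXct(paste("1970-01-01", time_strings),
                    format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# One break per minute, kept as POSIXct (not passed through format())
breaks <- seq(as.POSIXct("1970-01-01 00:00:00", tz = "UTC"),
              as.POSIXct("1970-01-02 00:00:00", tz = "UTC"),
              by = "1 min")

# Label each bin by its start time; formatting happens only here
bin_labels <- format(breaks[-length(breaks)], "%H:%M")
bins <- cut(times, breaks = breaks, labels = bin_labels)

# Counts for the bins that actually contain data
table(droplevels(bins))
```

Each bin is closed on the left and open on the right, so 00:01:00 itself falls into the "00:01" bin.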

Charting cumulative returns of a non time-based object

I have a vector of asset returns without dates in each row.
Is there a method similar to chart.CumReturns from the PerformanceAnalytics package that does not require a time-based object (a vector, data frame, etc. indexed by dates)? I do not have dates in the rows.
If you want to keep all the functionality of chart.CumReturns and appearance of plots generated by the function, you may create fake dates, convert the vector to a format that chart.CumReturns accepts (e.g. xts or zoo), and then plot using chart.CumReturns with the fake x axis removed. It seems that chart.CumReturns does not handle order.by = index(x), thus you need a 'real' date.
library(PerformanceAnalytics)
library(xts)
# an example vector, taken from the edhec data set shipped with PerformanceAnalytics
data(edhec)
vec <- coredata(edhec)[ , "Funds of Funds"]
# create fake dates, e.g.:
date <- seq(Sys.Date(), by = "1 month", length.out = length(vec))
# convert to xts (or zoo) object
xt <- xts(x = vec, order.by = date)
# plot without fake x axis
chart.CumReturns(xt, main = "Cumulative Returns", xaxis = FALSE)
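
If the exact look of chart.CumReturns is not essential, cumulative returns can also be computed and plotted against the observation index in base R. The following is a minimal sketch assuming geometric chaining of simple returns; the return values and variable names are made up for illustration, and the plot will not match the styling of chart.CumReturns.

```r
# Made-up simple returns with no dates attached
returns <- c(0.010, -0.020, 0.015, 0.030, -0.005)

# Geometric cumulative return after each observation
cum_returns <- cumprod(1 + returns) - 1

# Plot against the observation number instead of a date
plot(seq_along(cum_returns), cum_returns, type = "l",
     xlab = "Observation", ylab = "Cumulative return",
     main = "Cumulative Returns")
abline(h = 0, lty = 2)  # reference line at zero
```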

Default Axis Format of PosixCt

Is there a way to change the default format for how POSIXct labels appear when using plot and when they are part of a data frame (Date HH:MM instead of just HH:MM)?
It would be nice if I could do this without having to issue an axis command each time or converting the data frame to an xts object.
Credit for this answer goes to Vincent Zoonekynd.
You can pass the format argument to plot() to print the axis labels in "%Y-%m-%d %H:%M" format.
Please see the code below:
df <- data.frame(
ms = c(10485849612, 10477641600, 10561104000, 10562745600),
value = 1:4
)
df$posix_time <- as.POSIXct(df$ms, origin = "1582-10-14", tz = "GMT")
plot(df$posix_time, df$value, format = "%Y-%m-%d %H:%M")
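
If passing format on every call is inconvenient, one option is a small wrapper that fixes the label format in one place. This is only a sketch: plot_datetime is a hypothetical helper name (not part of base R), and the sample times are made up for illustration. It suppresses the default x axis and draws one with axis.POSIXct() instead.

```r
# Hypothetical helper: plot y against POSIXct x with a fixed label format
plot_datetime <- function(x, y, fmt = "%Y-%m-%d %H:%M", ...) {
  plot(x, y, xaxt = "n", ...)                  # suppress the default x axis
  axis.POSIXct(side = 1, x = x, format = fmt)  # draw date-time labels
  invisible(NULL)
}

# Made-up example: four readings taken six hours apart
times <- as.POSIXct("2020-01-01 00:00", tz = "UTC") + 3600 * c(0, 6, 12, 18)
plot_datetime(times, 1:4, xlab = "Time", ylab = "Value")
```

The axis command is still issued, but only inside the helper, so the calling code stays as short as a plain plot() call.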

How to plot sunrise over a year in R?

I have data that looks like this:
"date", "sunrise"
2009-01-01, 05:31
2009-01-02, 05:31
2009-01-03, 05:33
2009-01-05, 05:34
....
2009-12-31, 05:29
and I want to plot this in R, with "date" as the x-axis, and "sunrise" as the y-axis.
You need to work a bit harder to get R to draw a suitable plot (i.e. get suitable axes). Say I have data similar to yours (here in a csv file for convenience):
"date","sunrise"
2009-01-01,05:31
2009-01-02,05:31
2009-01-03,05:33
2009-01-05,05:34
2009-01-06,05:35
2009-01-07,05:36
2009-01-08,05:37
2009-01-09,05:38
2009-01-10,05:39
2009-01-11,05:40
2009-01-12,05:40
2009-01-13,05:41
We can read the data in and format it appropriately so R knows the special nature of the data. The read.csv() call includes argument colClasses so R doesn't convert the dates/times into factors.
dat <- read.csv("foo.txt", colClasses = "character")
## Now convert the imported data to appropriate types
dat <- within(dat, {
date <- as.Date(date) ## no need for 'format' argument as data in correct format
sunrise <- as.POSIXct(sunrise, format = "%H:%M")
})
str(dat)
Now comes the slightly tricky bit as R gets the axes wrong (or perhaps better to say they aren't what we want) if you just do
plot(sunrise ~ date, data = dat)
## or
with(dat, plot(date, sunrise))
The first version gets both axes wrong, and the second can dispatch correctly on the dates so gets the x-axis correct, but the y-axis labels are not right.
So, suppress the plotting of the axes, and then add them yourself using axis.FOO functions where FOO is Date or POSIXct:
plot(sunrise ~ date, data = dat, axes = FALSE)
with(dat, axis.POSIXct(x = sunrise, side = 2, format = "%H:%M"))
with(dat, axis.Date(x = date, side = 1))
box() ## complete the plot frame
HTH
I think you can use the as.Date and as.POSIXct functions to convert the two columns into the proper format (the format parameter of as.POSIXct should be set to "%H:%M").
The standard plot function should then be able to deal with times and dates by itself.
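
For comparison, the same conversion idea can be sketched with ggplot2, whose date and date-time scales take a label format directly. This is a minimal sketch, not a definitive solution: it uses only the first few rows of the example data, and the anchor date 1970-01-01 and the UTC time zone are arbitrary assumptions chosen so that only the time of day varies.

```r
library(ggplot2)

# First few rows of the example data, read as character
dat <- data.frame(
  date    = c("2009-01-01", "2009-01-02", "2009-01-03", "2009-01-05"),
  sunrise = c("05:31", "05:31", "05:33", "05:34"),
  stringsAsFactors = FALSE
)

dat$date <- as.Date(dat$date)

# Attach an arbitrary fixed day so every sunrise is a complete date-time
dat$sunrise <- as.POSIXct(paste("1970-01-01", dat$sunrise),
                          format = "%Y-%m-%d %H:%M", tz = "UTC")

p <- ggplot(dat, aes(x = date, y = sunrise)) +
  geom_line() +
  scale_x_date(date_labels = "%d %b") +     # day and month on the x axis
  scale_y_datetime(date_labels = "%H:%M")   # time of day on the y axis

print(p)
```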
