R - Datetimes with ggplot

What is the correct way to deal with datetimes in ggplot?
I have data for several different dates, and I would like to facet by date and plot only the points that fall within the same time-of-day window on each date, e.g. between 1:30PM and 1:35PM. How can I achieve this?
My data looks like:
datetime col1
2015-01-02 00:00:01 20
... ...
2015-01-02 11:59:59 34
2015-02-19 00:00:03 12
... ...
2015-02-19 11:59:58 27
I often find myself wanting to plot time series in ggplot with datetime objects on the x-axis, but I don't know how to use only the times when the dates aren't of interest.

The lubridate package will do the trick. Its floor_date and ceiling_date functions round datetimes down or up to a chosen unit, which you can use to transform your datetime column.
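A minimal sketch of one way to get the faceted plot, not taken from the answers here: put every observation on the same dummy date so that only the time of day varies, filter to the window, and facet by the real date. The data frame df, its values, and the names tod, win_start, win_end and win are all made up for illustration. The data preparation uses base R only; the plot itself assumes ggplot2 is installed.

```r
# Hypothetical sample data shaped like the question's (values are made up).
df <- data.frame(
  datetime = as.POSIXct(c("2015-01-02 13:29:00", "2015-01-02 13:31:10",
                          "2015-01-02 13:33:45", "2015-02-19 13:30:30",
                          "2015-02-19 13:34:59", "2015-02-19 13:36:00"),
                        tz = "UTC"),
  col1 = c(20, 34, 25, 12, 27, 30)
)

# Calendar date of each observation, used only for faceting.
df$date <- as.Date(df$datetime, tz = "UTC")

# Re-attach every clock time to one dummy date so all facets share an x-axis.
df$tod <- as.POSIXct(
  paste("1970-01-01", format(df$datetime, "%H:%M:%S", tz = "UTC")),
  tz = "UTC"
)

# Keep only the observations between 1:30PM and 1:35PM.
win_start <- as.POSIXct("1970-01-01 13:30:00", tz = "UTC")
win_end   <- as.POSIXct("1970-01-01 13:35:00", tz = "UTC")
win <- df[df$tod >= win_start & df$tod <= win_end, ]

# Plot only if ggplot2 is available (assumption: it is installed).
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(win, aes(x = tod, y = col1)) +
    geom_point() +
    facet_wrap(~ date) +
    scale_x_datetime(date_labels = "%H:%M")
  # print(p) draws the plot
}
```

The dummy date is arbitrary; it only exists so that ggplot sees a continuous datetime axis on which 13:30 falls at the same position in every facet.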

I always use the chron package for times. Its times class completely disregards dates and stores your time numerically, as a fraction of a day (e.g. 1:30PM is stored as 0.5625 because it's 13.5 hours into a 24-hour day). That allows you to perform math on times, which is great for a lot of reasons, including calculating average time, the time between two points, etc.
For specific help with your plot you'll need to share a sample data frame in an easily copy-able format, and show the code you've tried so far.
This is a question I'd asked previously regarding the chron package, and it also gives an idea of how to share your data/ask a question that's easier for folks to reproduce and therefore answer:
Clear labeling of times class data on horizontal barplot/geom_segment
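The idea of storing time of day as a plain number so that arithmetic works can be sketched in base R. This is a stand-in that uses hours since midnight, not the chron package's own class, and the timestamps are made up.

```r
# Hypothetical timestamps; only the clock time matters here.
x <- as.POSIXct(c("2015-01-02 13:30:00", "2015-02-19 14:00:00"), tz = "UTC")
lt <- as.POSIXlt(x, tz = "UTC")

# Hours since midnight as a plain number (13:30 becomes 13.5).
hours <- lt$hour + lt$min / 60 + lt$sec / 3600

mean(hours)  # average time of day, in hours
diff(hours)  # gap between the two clock times, in hours
```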

Related

Average of Data in accordance with similar time

I have a dataset of solar power generation over 24 hours for many days. I need to find the average power generated at each time of day; for example, the average of the power generated at 9:00:00 AM across all days.
Start by stripping out the time from the date-time variable.
Assuming your data is called myData
library(lubridate)
myData$Hour <- hour(strptime(myData$Time, format = "%Y-%m-%d %H:%M:%S"))
Then use ddply from the plyr package, which allows us to apply a function to each subset of the data.
library(plyr)
myMeans <- ddply(myData[,c("Hour", "IT_solar_generation")], "Hour", numcolwise(mean))
The resulting frame will have one column called Hour, which gives the hour of the day, and another with the mean at each hour.
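If you would rather not load extra packages, the same hourly mean can be sketched in base R with aggregate. The column names follow the answer above; the sample values are made up, and the time zone is fixed to UTC only to keep the example deterministic.

```r
# Hypothetical data; column names follow the answer, values are made up.
myData <- data.frame(
  Time = c("2016-01-01 09:00:00", "2016-01-01 10:00:00",
           "2016-01-02 09:00:00", "2016-01-02 10:00:00"),
  IT_solar_generation = c(1.0, 2.0, 3.0, 6.0)
)

# Extract the hour of day (0-23) from the timestamp strings.
myData$Hour <- as.POSIXlt(myData$Time,
                          format = "%Y-%m-%d %H:%M:%S", tz = "UTC")$hour

# Mean generation per hour, pooled over all days.
myMeans <- aggregate(IT_solar_generation ~ Hour, data = myData, FUN = mean)
myMeans
```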
On a separate but important note: when you ask a question, you should describe the attempts you've made so far to answer it. This isn't a help desk.

r time interval plot

time count
2017-03-08 19:33 1
2017-03-23 22:11 1
2017-03-30 3:30 10
2017-03-09 19:33 13
2017-03-23 22:11 1
2017-03-31 3:30 1
.....
This data shows how quickly consumers write comments.
I want to make a plot that makes it easy to see how fast comments come in.
For example:
The x-axis is a time series starting from 2017-03-08,
with one bar for each fixed interval (seconds or minutes).
When comments are being written quickly, the bars are dense,
and as time goes on and the rate slows down, the bars become sparse.
How can I make this?
cc5<-dt[, tdiff := difftime(cc, shift(cc, fill=cc[1L]), units="secs"),
by=title]
Using this code, I can create a difftime column.
I have one more problem: the time column is of character type.
I tried to convert it to a date type using as.Date, but it didn't work,
so I converted it to POSIXct instead.
I think I need a date type to use it as a time-series x-axis.
I'm not 100% sure that I understand the result you want,
but generally when I want to put dates on the x-axis, I go to "Understanding dates and plotting a histogram with ggplot2 in R"
and use Gauden's Code v1. If you have successfully changed the character column into a POSIXct time, as.Date() should work fine.
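A rough base R sketch of the conversion and binning steps, assuming the character format shown in the question ("%Y-%m-%d %H:%M") and daily bins. The vector names time_chr, t, day and per_day are made up, and UTC is an assumption to keep the result reproducible.

```r
# The six sample rows from the question, as character and numeric vectors.
time_chr <- c("2017-03-08 19:33", "2017-03-23 22:11", "2017-03-30 3:30",
              "2017-03-09 19:33", "2017-03-23 22:11", "2017-03-31 3:30")
count <- c(1, 1, 10, 13, 1, 1)

# Parse the character column; as.Date() then works on the POSIXct result.
t <- as.POSIXct(time_chr, format = "%Y-%m-%d %H:%M", tz = "UTC")
day <- as.Date(t, tz = "UTC")

# Total comments per day, then one bar per day.
per_day <- tapply(count, day, sum)
barplot(per_day, las = 2)
```

For finer bins, cut(t, breaks = "hour") could replace day as the grouping variable.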

Resampling time series with xts and zoo packages in R

I am trying to resample a dataset with a given temporal resolution of 5 min (source). In order to get a 30 min resampled temporal resolution I've tried:
#Date and Time together
SRI_2010$Date_Time = paste(SRI_2010$Date, SRI_2010$Time, sep=" ")
SRI_2010$Date_Time=as.character(SRI_2010$Date_Time)
SRI_2010$Date_Time=as.POSIXct(SRI_2010$Date_Time,format="%d/%m/%Y %H:%M")
#Creating the zoo object
SRI_2010.zoo <- zoo(SRI_2010,as.POSIXct(SRI_2010$Date_Time))
#Criteria for the resampling
ends2010 <- endpoints(SRI_2010.zoo,'minutes', 30)
SRI_30m_2010 <-period.apply(SRI_2010.zoo$SRI..W.m2.,ends2010,mean)
At the very beginning, I was quite satisfied because the code worked out, but after a double-check, I've realised it calculates the mean values at min 25 and 55, instead of at min 00 and 30 that I am interested in.
Example:
> SRI_30m_2010
2010-07-28 04:55:00 2010-07-28 05:25:00
         3.80000000         12.06666667
2010-07-28 05:55:00 2010-07-28 06:25:00
        19.73333333         28.46666667
2010-07-28 06:55:00 2010-07-28 07:25:00
        40.30000000         61.60000000
This small issue is very annoying when I try to combine datasets with different temporal resolutions into a common one. Does anyone know how I could sort this out?
The "issue" is that endpoints is doing what it was designed to do. It's returning the last timestamp of each period. I recommend you use align.time to move the index timestamp forward to the minutes you're interested in.
s <- align.time(as.xts(SRI_30m_2010), 60*30)
It's also not much of an issue if you're trying to combine multiple series with different resolutions into a single xts object. You could just merge them all, use na.locf or similar to fill in missing values, then extract the resolution you're interested in. I believe the xts FAQ shows you how to do this, and I know I've demonstrated it more than a couple times in my other answers on stackoverflow.
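To see why the labels move from :25/:55 to :00/:30, here is a base R sketch of the same relabelling idea. It is a stand-in, not the xts implementation, and the series is made up: each 5-minute reading is tagged with the end of its 30-minute block, and the mean is taken per block.

```r
# Hypothetical 5-minute series covering one hour (values are made up).
t <- seq(as.POSIXct("2010-07-28 04:30:00", tz = "UTC"),
         by = "5 min", length.out = 12)
v <- 1:12

# Label each reading with the end of its 30-minute block:
# 04:30..04:55 -> 05:00, 05:00..05:25 -> 05:30.
secs <- as.numeric(t)
block_end <- as.POSIXct(secs - secs %% 1800 + 1800,
                        origin = "1970-01-01", tz = "UTC")

# Mean per block, keyed by the block's end time.
means <- tapply(v, format(block_end, "%Y-%m-%d %H:%M", tz = "UTC"), mean)
means
```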

how, in R, to plot time of day versus calendar date

I regret that I come to post here after being burned-out on hours of Internet searching regarding this simplistic question.
I have several data sets to plot in R, each consisting of two columns of data: time, date. I am using R 2.11.0 on a Windows computer, via the Rgui.
Time is "time of day" that an event is observed. As an example, it is recognized as:
Factor w/ 87 levels "5:53","5:54",..: 84 85 85 85 86 ...
Date is calendar date, recognized as:
Class 'Date' num [1:730] 13879 13880 13881 13882 13883 ...
The time values are recorded in the format of a 24-hr clock, h:mm or hh:mm. The date values are displayed yyyy-mm-dd.
I want to plot time (y-axis) vs. date (x-axis).
Using
plot(date,time)
gives an accurate-looking plot, but the y-axis is labeled as the numeric factor values (about 0 to 90), rather than the desired, temporally-ordered levels of the factor variable. The x-axis is labeled in the desired, human-readable format.
How can I correct this? Is there a "time of day" format in R that I can convert my "time" variable into? I would subsequently like to do arithmetic on the time values as well, and would not mind having to carry one column of values to use in plotting and one column of values for maths.
I ran across several examples online of manipulation of (date + time) variables in R, and converting those to different formats. I do not believe this is my problem, as I have separate fields for time and date and want to plot one against the other.
My thanks to you in advance for your suggestions, or your directions to a web-accessible resource (no appropriate libraries or bookstores at my location).
There may be an easier way to do this, but you can always label the y-axis yourself. Adjust the ticksAt vector below to find something that looks suitable for your data.
Data <- data.frame(date=Sys.Date()+1:10,time=factor(paste(5,41:50,sep=":")))
with(Data, plot(date,time,yaxt="n"))
ticksAt <- c(1,3,5,7,9)
axis(2, at=ticksAt, labels=levels(Data$time)[ticksAt])
?plot.zoo has some good examples of how to create pretty axis annotations, though some of them may be zoo-specific. ?par is also a good resource.
ts and timeSeries are two good choices.
Take a look at Related
Let's assume you have two vectors, one of Date class named "dt" and the other a factor named "tm":
x <- paste(as.character(dt[1:2]), as.character(tm[1:2]))
strptime(x, "%Y-%m-%d %H:%M")
## [1] "2008-01-01 05:53:00" "2008-01-02 05:54:00"
class(strptime(x, "%Y-%m-%d %H:%M"))
## [1] "POSIXt" "POSIXlt"
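Another option, sketched here in base R under the assumption that the factor levels look like "h:mm" or "hh:mm": convert the times to minutes since midnight, which gives a numeric column you can both plot and do arithmetic on, and then label the axis as clock times. The sample values and the names mins and ticks are made up.

```r
# Hypothetical sample in the question's layout: Date vector plus time factor.
dt <- as.Date("2008-01-01") + 0:3
tm <- factor(c("5:53", "5:54", "17:05", "6:10"))

# Split "h:mm" into hours and minutes, then convert to minutes since midnight.
parts <- strsplit(as.character(tm), ":", fixed = TRUE)
mins <- sapply(parts, function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]))

# Plot numeric minutes, but label the y-axis as clock times every 3 hours.
plot(dt, mins, yaxt = "n", ylim = c(0, 24 * 60),
     xlab = "date", ylab = "time of day")
ticks <- seq(0, 24 * 60, by = 180)
axis(2, at = ticks,
     labels = sprintf("%02d:%02d", ticks %/% 60, ticks %% 60), las = 1)
```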

R: Aggregating by dates with POSIXct?

I have some zoo series that use POSIXct index.
In order to aggregate by days I've tried these two ways:
aggregate(myzoo,format((index((myzoo)),"%Y-%m-%d")),sum)
aggregate(myzoo,as.Date(index(myzoo)),sum)
I don't know why they don't give the same output.
myzoo series had the weekends removed. The "as.Date way" seems to be OK but the "format way" aggregation gives me data on the weekends.
Why?
Which one is right?
I've even tried as.POSIXct(format(...)).
As I mentioned in my comment, you need to be careful when changing the format of a timestamp that includes time with a time zone, because it can get shifted between days. Without any data, it's hard to say exactly what your problem is, but you might also try apply.daily from xts:
apply.daily(myzoo, sum)
Here's a working example:
> library(xts)
> x <- zoo(2:20, as.POSIXct("2003-02-01") + (2:20) * 7200)
> apply.daily(x, sum)
2003-02-01 22:00:00 2003-02-02 16:00:00
                 65                 144
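The day shift mentioned above can be reproduced in base R with a single timestamp. This is only an illustration with a made-up value and an assumed time zone (America/New_York), not the asker's data: the same instant falls on a Friday in local time and on a Saturday in UTC, so grouping by one or the other puts it in a different daily bucket.

```r
# Hypothetical timestamp: late Friday evening in New York.
x <- as.POSIXct("2003-01-31 23:30:00", tz = "America/New_York")

# Calendar date in the series' own time zone.
local_day <- format(x, "%Y-%m-%d", tz = "America/New_York")

# Calendar date of the same instant in UTC.
utc_day <- format(as.Date(x, tz = "UTC"))

local_day  # "2003-01-31", a Friday
utc_day    # "2003-02-01", a Saturday
```

Passing the same tz explicitly to both as.Date and format is one way to make the two aggregations agree.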
